event on the ME. The events we're likely to see are LINK (the ME was
added to the match list), PUT (weird to see first, but means that the ME
was linked to the match list and then matched), or PUT_OVERFLOW, meaning
the message was unexpected.
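As a rough illustration of the three cases above, here is a minimal sketch of dispatching on the first event seen for an ME. The `EventType` enum and `describe_first_event` helper are hypothetical names made up for this example; they are not the Portals4 API or the MTL's actual code.

```python
from enum import Enum, auto


class EventType(Enum):
    # Hypothetical stand-ins for the three event kinds named above.
    LINK = auto()
    PUT = auto()
    PUT_OVERFLOW = auto()


def describe_first_event(event: EventType) -> str:
    """Map the first event seen on an ME to what it implies about the ME."""
    if event is EventType.LINK:
        return "linked"              # ME was added to the match list
    if event is EventType.PUT:
        return "linked-and-matched"  # ME was linked and then matched
    if event is EventType.PUT_OVERFLOW:
        return "unexpected"          # the message was unexpected
    raise ValueError(f"unhandled event: {event!r}")
```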
This commit was SVN r26199.
together, so implement all functions in the MTL interface for all
MTLs. The only places NULL was still being set were add_comm/del_comm
and matched probe, both of which are straightforward to implement (or
return ERROR_NOT_IMPLEMENTED, since the PML can't emulate matched probe).
This commit was SVN r26194.
the ompi_message_t structure to properly initialize convertor (the peer
is available in the request in OB1, and wasn't needed when I did the
original implementation).
* Implement matched probe for the Portals4 MTL and add NULL function pointers
for the other MTLs.
* Add add_comm and del_comm functions to portals4 MTL so that direct call
almost works again.
* Add NEWS item that we've implemented matched probe
This commit was SVN r26180.
Note that the previous patch allowed the following test to -pass-:
ompi-tests/mpich_tester/mpich_pt2pt/truncmult.c
This patch makes that test -fail- due to the assumption that MPI_Wait will update the status.MPI_ERROR field. In Open MPI we do not do this, so the MPI_ERROR field being inspected will remain set to MPI_ERR_PENDING. See comments in req_wait.c for why we do this.
If we change the test to not inspect the MPI_ERROR field after calling MPI_Wait successfully, then the test would pass correctly with this patch.
This change was made per discussion on the below email thread:
http://www.open-mpi.org/community/lists/devel/2012/03/10753.php
This commit was SVN r26177.
The following SVN revision numbers were found above:
r26172 --> open-mpi/ompi@03a33417d5
* split eq into send and receive eqs so that we can control the number
of outstanding events in send eq and ensure we never lose an ack
* Shouldn't ever truncate on short unexpected receive blocks, so don't set
the truncate bit
* Track active vs. waiting for free short unexpected receive blocks to
to ensure an active short unexpected receive block is posted coming out
of flow control. Also allow creation of "temporary" blocks which should
be released once FREE event is received.
* Slight reorganization of some code in preparation for more flow control
work.
This commit was SVN r26174.
- configure:
- changed default CUPTI library path to $CUPTI-DIR/lib64
- VT Libs:
- corrected prototype of MPI_Get_address in Fortran MPI wrappers (the second parameter should be an MPI_Aint* instead of MPI_Fint*)
- temporarily removed MPI_<Comm|Type|Win>_<get|set>_attr and MPI_Attr_<get|put> from the Fortran MPI wrappers due to missing conversion of the attribute value parameter
- Docu:
- latex doc \usepackage[T1]{fontenc} so that _ can be searched and copied
- smaller font in Environment Variables section
- some improvements in CUDA section
- removed GPU idle time as official feature for CUPTI tracing method
This commit was SVN r26161.
- general:
- added missing entry in ChangeLog
- vtunify[-mpi]:
- fixed possibly uninitialized global token for the predefined Node and "All" process groups
This commit was SVN r26147.
- general:
- corrected OTF version number
- otfprofile:
- removed leading '=' from CSV lines to make it loadable into spreadsheets (e.g. Open Office)
- fixed process naming in CSV output of collective operation statistics
Changes to VT:
- configure:
- added *_FOR_BUILD variables to CrayXE's default configure options; required for cross-building
- VT libs:
- fixed GPU communication, due to new process ID splitting
- fixed parsing of PAPI native events in VT_METRICS; use strtok_r instead of strtok which is successively called in PAPI_event_name_to_code
- added VT_METRICS_SEP to definition comments (-> Vampir's trace info)
- Docu:
- fixed link to TAU Reference Guide
This commit was SVN r26137.
This feature can be enabled at compile time with --with-cma passed
to configure.
At runtime it is also necessary to add "--mca btl_sm_use_cma 1"
to the mpirun command.
If both CMA and KNEM are compiled in and enabled at runtime then
KNEM will take precedence and CMA will disable itself.
This commit was SVN r26134.
Changes to OTF:
- general:
- updated copyright information (2011->2012)
- otfmerge-mpi:
- use the MPI-2 versions of MPI_Address and MPI_Type_struct
- otfdump:
- don't abort when reading events fails - the input tracefile might only have statistics
Changes to VT:
- general:
- updated version number to 5.12.2openmpi
- updated copyright information (2011->2012)
- configure:
- added configure switches to enable/disable CUPTI and CUDA wrapping
- fixed detection of C++ runtime libraries for Cray and PGI v11.x compilers
- fixed detection of Cray compiler's OpenMP flag
- fixed detection of MPI_IN_PLACE
- disable support for RTLD_DEFAULT on CrayX? platforms; it's provided by dlfcn.h but not working
- added '-force_flat_namespace' to linker flags of compiler wrappers on MacOS (so that Open MPI's libmpi_f77 calls the VT MPI wrapper functions, not the original ones)
- default configure options on Cray platforms: use compiler option '--target=$XTPE_COMPILE_TARGET' only if the environment variable is set
- VT libs:
- added support for CUDA tracing via CUPTI callbacks and activities (runtime and driver API, kernels, memory copies, GPU idle time and GPU memory usage)
- added support for cudaMemcpyDefault and synchronous peer-to-peer memory copies in CUDA library wrapper
- fixed a bug in CUDA runtime wrapper initialization and thread creation
- fixed a build bug that occurred if CUDA and CUPTI are found, but support for library tracing is disabled
- use stack-allocated char-array when composing vtunify command; on some platforms system() results in exit code 127 when using a dynamically allocated char-array
- fixed bug in async. counter plugin
- fixed handling of empty MPI groups (MPI_GROUP_EMPTY)
- fixed handling of MPI groups implicitly generated by MPI_Win_create
- fixed conversion from MPI_Fint-arrays to MPI_Aint-arrays in Fortran MPI wrappers
- fixed order of OpenMP threads based on their IDs (omp_get_thread_num)
- fixed parsing of filter file to consider non-rank-specific filter rules appearing after a rank selection for disabling
- fixed handling of 'errno' in LIBC[-I/O] wrappers for statically linked applications (set application's errno to the errno defined in the external LIBC which is used for calling the real functions)
- suppress warnings about usage of deprecated MPI functions (OMPI_WANT_MPI_INTERFACE_WARNING=0)
- vtunify[-mpi]:
- fixed potential memory corruption during enqueuing recv. messages for p2p message matching
- vtunify-mpi:
- use the MPI-2 versions of MPI_Address and MPI_Type_struct
- removed unused MPI wrappers
- fixed assertion in p2p message matching which occurred when processing local traces with disabled ranks
- vtdyn:
- load user-specified shared libraries (-s SHLIB) into the mutatee before starting the instrumentation; adds support for instrumenting shared libraries which are loaded during runtime
- compiler wrappers:
- fixed detection of MPI library linked in the path form (e.g. libmpi.a instead of -lmpi)
- fixed corrupt library order when using vtnvcc for linking MPI/CUDA mixed program
- OPARI:
- fixed Fortran parsing for detecting end of block DO loops
This commit was SVN r26114.
- MAJOR! get src descriptor leaks if mca_bml_base_send fails
- minor. descriptor leaked in mca_pml_send_request_start_copy if the btl returns OMPI_ERR_RESOURCE_BUSY.
This commit was SVN r26077.
* fixed some bugs where "unknown" tokens were allowed on the command
line (which should really only be used for orterun).
* if an unknown token is encountered, print a short error to stderr
and quit with a nonzero exit status
* if we don't find the right number of parameters to an option, print
a short error to stderr and quit with a nonzero exit status
* when --help is given, print the help message to stdout (not stderr)
and quit with a zero exit status
* added --showme:help option to the wrapper compilers
* updated docs in opal/util/cmd_line.h
* other small/miscellaneous CLI parsing bugs in various tools
I won't bore you with what we did before. :-) Here are some examples
of what the new behavior looks like:
{{{
% ompi_info --bogus
ompi_info: Error: unknown option "--bogus"
Type 'ompi_info --help' for usage.
% ompi_info --param bogus
ompi_info: Error: option "--param" did not have enough parameters (2)
Type 'ompi_info --help' for usage.
%
}}}
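The rules listed above can be sketched in a few lines. This is only an illustration under stated assumptions, not the opal/util/cmd_line.c implementation: `run_cli` and its `known_options` table are hypothetical, and the exit status 1 is just one possible "nonzero" value.

```python
import io


def run_cli(prog, argv, known_options, out, err):
    """Parse argv and return an exit status.

    known_options maps an option name to the number of parameters it takes.
    """
    i = 0
    while i < len(argv):
        token = argv[i]
        if token == "--help":
            out.write(f"usage: {prog} [options]\n")  # help goes to stdout
            return 0
        if token not in known_options:
            err.write(f'{prog}: Error: unknown option "{token}"\n')
            err.write(f"Type '{prog} --help' for usage.\n")
            return 1
        nargs = known_options[token]
        params = argv[i + 1:i + 1 + nargs]
        if len(params) < nargs:
            err.write(f'{prog}: Error: option "{token}" did not have '
                      f"enough parameters ({nargs})\n")
            err.write(f"Type '{prog} --help' for usage.\n")
            return 1
        i += 1 + nargs
    return 0


# Usage: capture both streams to see where each message goes.
out, err = io.StringIO(), io.StringIO()
status = run_cli("ompi_info", ["--bogus"], {"--param": 2}, out, err)
```

Passing the output streams in explicitly keeps the stdout/stderr split easy to check in a test.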
This commit was SVN r26072.
logic back (that was replaced by r25965 and r26000) and fix the one
place that missed OMPI_LOGICAL_2_INT. This missing OMPI_LOGICAL_2_INT
was the real problem.
This commit was SVN r26053.
The following SVN revision numbers were found above:
r25965 --> open-mpi/ompi@b10ebf4b2d
r26000 --> open-mpi/ompi@90811cb50c
- otfmerge-mpi:
- use the MPI-2 versions of MPI_Address and MPI_Type_struct
Changes to VT:
- VT libs:
- suppress warnings about usage of deprecated MPI functions (OMPI_WANT_MPI_INTERFACE_WARNING=0)
- vtunify-mpi:
- use the MPI-2 versions of MPI_Address and MPI_Type_struct
- removed unused MPI wrappers
This commit was SVN r26051.
Uses new CUDA IPC support. Also, a few minor changes in PML to take
advantage of it.
This code has no effect unless user asks for it explicitly via
configure arguments. Otherwise, it is either #ifdef'ed out or
not compiled.
This commit was SVN r26039.
defensive about the check of the flag value for the C-based keyvals.
We would never have had a problem because of the specific input data,
but being defensive is good (and it makes the code a little less
subtle / easier to read).
Also add in more comments about exactly what is going on, since this
is complicated stuff. :-)
This commit was SVN r26000.
MPI started!).
The FLAG argument to fortran attribute copy functions is a LOGICAL,
meaning that it can only return .TRUE. or .FALSE. The corresponding C
argument is an int, and the MPI spec says that it must return 1 or 0.
However, in Fortran, .TRUE. is not always necessarily == 1. So we
need to expand the test to see if it's a Fortran callback. If so,
check for the Fortran .TRUE. value (not 1). If it's a C callback,
then check for 1.
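A minimal sketch of the expanded test, written in Python for brevity rather than the C used in the tree. The function name and the `fortran_true` parameter are hypothetical; the sketch assumes the integer representation of .TRUE. for the Fortran compiler in use is already known.

```python
def copy_flag_is_set(flag: int, is_fortran_callback: bool,
                     fortran_true: int) -> bool:
    """Decide whether an attribute copy callback set its FLAG output.

    fortran_true is the integer value the Fortran compiler uses for
    .TRUE. (hypothetical parameter; assumed to be known in advance).
    """
    if is_fortran_callback:
        return flag == fortran_true  # compare against .TRUE., not 1
    return flag == 1                 # C callbacks return 1 or 0
```

For example, with a compiler whose .TRUE. happens to be -1 (a value used here purely for illustration), `copy_flag_is_set(-1, True, -1)` is true while the same flag from a C callback is not.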
This commit was SVN r25965.
1. no binding support - indicated by a negative return code from get_cpubind
2. binding supported, but not bound - the bitset returned by get_cpubind is the same as the available cpuset
3. binding supported and bound - bitset from get_cpubind is a subset of available cpuset
4. only one cpu is available - in this case, get_cpubind matches the available cpuset, but we are effectively bound
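The four cases can be told apart roughly as follows. This is a sketch under assumptions: cpusets are modeled as Python sets of cpu indices, `rc` stands for the return code from get_cpubind, and `classify_binding` is a hypothetical helper name. Note that case 4 has to be checked before case 2, since in both the bound set equals the available set.

```python
def classify_binding(rc: int, bound: set, available: set) -> str:
    """Classify a process's binding state from get_cpubind results."""
    if rc < 0:
        return "no-binding-support"   # case 1
    if len(available) == 1:
        return "bound"                # case 4: effectively bound
    if bound == available:
        return "not-bound"            # case 2
    if bound < available:
        return "bound"                # case 3: proper subset
    # Not covered by the four cases above; treated as an error here.
    raise ValueError("bound cpuset is not within the available cpuset")
```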
This commit was SVN r25957.
patch from the ticket, released under the BSD license.
This commit was SVN r25949.
The following Trac tickets were found above:
Ticket 2933 --> https://svn.open-mpi.org/trac/ompi/ticket/2933
- re-enable sendi
- move smsg common code into btl_ugni_smsg.h
- added new parameters for smsg/eager frags
- use get for frags larger than the smsg_limit
- bug fixes
- code cleanup
This commit was SVN r25897.
Adds a lock to protect the sm pending_sends list from concurrent access
Fixes bug where btl_sm_process_pending_sends would return an item to
the free list and then continue to use it for a little while
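The two fixes can be pictured with a small sketch. It is written in Python, not the BTL's C code, and `PendingSendQueue` with its dictionary items is a hypothetical stand-in: the lock guards the list, and an item is recycled only after its last use.

```python
import threading
from collections import deque


class PendingSendQueue:
    """Hypothetical stand-in for a pending-sends list shared by threads."""

    def __init__(self, send):
        self._lock = threading.Lock()  # guards self._pending
        self._pending = deque()
        self._send = send              # callable(peer, payload)
        self.free_list = []            # recycled items

    def enqueue(self, item):
        with self._lock:
            self._pending.append(item)

    def process_pending(self):
        while True:
            with self._lock:
                if not self._pending:
                    return
                item = self._pending.popleft()
            # Finish using the item before recycling it; once it is on
            # the free list another thread may reuse it.
            self._send(item["peer"], item["payload"])
            self.free_list.append(item)
```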
cmr:v1.6
This commit was SVN r25878.
The following Trac tickets were found above:
Ticket 2998 --> https://svn.open-mpi.org/trac/ompi/ticket/2998
definitely should not be linking to more than libmpi.la! (remember
that libmpi.la now wholly contains libopen-rte.la, which wholly
contains libopen-pal.la).
This commit was SVN r25843.
of the group argument to MPI_COMM_CREATE.
cmr:v1.5:reviewer=jjhursey
cmr:v1.4.5:reviewer=jjhursey
This commit was SVN r25810.
The following Trac tickets were found above:
Ticket 2967 --> https://svn.open-mpi.org/trac/ompi/ticket/2967
- tests/thumbnail:
- removed unnecessary header include (stdbool.h) that breaks build on Solaris
Changes to VT:
- configure:
- fixed detection of Open64 compilers for automatic instrumentation
- VT libs:
- fixed non-increasing timestamps when flushing the trace buffer: check trace status after calling vt_update_counter() to prevent function leave events from recording, if maximum buffer flushes are reached
- calculate fixed record lengths only once when creating new buffer entries
- vtunify[-mpi]:
- minor code-optimization: use ++it instead of it++ in for-loops to prevent unnecessary copying of objects
This commit was SVN r25674.
- Use own implementation of assert() to work around a compiler bug (seen on MacOS using GCC v4.2.1):
The linker results in an undefined reference to ___builtin_expect() when using assert() within OpenMP-parallel regions.
This commit was SVN r25595.
- fixed a bug (potential segfault) in the MPI wrapper functions MPI_Gatherv and MPI_Scatterv which occurred due to illegal access to insignificant parameters on non-root ranks
- vtdyn:
- stop instrumenting if an error occurred during finalizing instrumentation set
- vtunify-mpi:
- added option '--stats' to unify only summarized information, no events
- reduced memory usage on rank 0: immediately send token translation tables to the corresponding worker ranks when they are complete
- send the "finished-flag" together with the last set of definitions read to rank 0 instead of sending an extra message
- OPARI:
- fixed detection of DO loop beginnings; a variable containing "do" in its name was detected as a DO loop :-(
- fixed processing of Fortran line-continuation appearing after a complete OpenMP directive
This commit was SVN r25584.
So provide a new parameter (can't have too many!) that handles this situation by stripping the prefix from the returned node name. Also do a little cleanup to ensure we cleanly exit from errors, without generating too many annoying messages.
This commit was SVN r25562.
Turns out, this isn't necessarily true. The Cray, for example, launches processes in a toroidal pattern, thus causing the daemons to wind up somewhere other than what we thought. Other environments (e.g., slurm) are also capable of such behavior, depending upon the default mapping algorithm they are told to use.
Resolve this problem by making the daemon-to-node assignment in the affected environments when the daemon calls back and tells us what node it is on. Order the nodes in the mapping list so they are in daemon-vpid order as opposed to the order in which they show in the allocation. For environments that don't exhibit this mapping behavior (e.g., rsh), this won't have any impact.
Also, clean up the vm launch procedure a little bit so it more closely aligns with the state machine implementation that is coming, and remove some lingering "slave" code.
This commit was SVN r25551.
Per http://www.open-mpi.org/community/lists/users/2011/11/17862.php,
to make MPI_IN_PLACE (and other sentinel Fortran constants) work on OS
X, we need to use the following compiler (linker) flag:
-Wl,-commons,use_dylibs
So if we're compiling on OS X, test to see if that flag works with the
compiler. If so, add it to the wrapper FFLAGS and FCFLAGS (note that
per a future update, we'll only have one Fortran compiler anyway).
Fixes trac:1982.
This commit was SVN r25547.
The following SVN revision numbers were found above:
r25545 --> open-mpi/ompi@7f9ae11faf
The following Trac tickets were found above:
Ticket 1982 --> https://svn.open-mpi.org/trac/ompi/ticket/1982
supposed to. I.e., half-baked/not complete stuff.
This commit backs out all of r25545. Sorry folks!
This commit was SVN r25546.
The following SVN revision numbers were found above:
r25545 --> open-mpi/ompi@7f9ae11faf
to make MPI_IN_PLACE (and other sentinel Fortran constants) work on OS
X, we need to use the following compiler (linker) flag:
-Wl,-commons,use_dylibs
So if we're compiling on OS X, test to see if that flag works with the
compiler. If so, add it to the wrapper FFLAGS and FCFLAGS (note that
per a future update, we'll only have one Fortran compiler anyway).
Fixes trac:1982.
This commit was SVN r25545.
The following Trac tickets were found above:
Ticket 1982 --> https://svn.open-mpi.org/trac/ompi/ticket/1982
- otfprofile[-mpi]:
- fixed compile error with the PGI compiler
Changes to VT:
- added support for LIBC [I/O] tracing on Cray XT platforms
- vtrun:
- do preload Dyninst runtime library (DYNINSTAPI_RT_LIB) when
instrumenting user functions by Dyninst
This commit was SVN r25505.
Brian dealt with this in the past by creating platform files and using "no-build" to block the components. This was clunky, but acceptable when only one organization was using that option. However, that number has now expanded to at least two more locations.
Accordingly, make --without-rte-support actually work by adding appropriate configury to prevent components from building when they shouldn't. While doing so, remove two frameworks (db and rmcast) that are no longer used as ORCM comes to a close (besides, they belonged in ORCM now anyway). Do some minor cleanups along the way.
This commit was SVN r25497.
https://svn.open-mpi.org/trac/ompi/wiki/ProcessPlacement
The wiki page is incomplete at the moment, but I hope to complete it over the next few days. I will provide updates on the devel list. As the wiki page states, the default and most commonly used options remain unchanged (except as noted below). New, esoteric and complex options have been added, but unless you are a true masochist, you are unlikely to use many of them beyond perhaps an initial curiosity-motivated experimentation.
In a nutshell, this commit revamps the map/rank/bind procedure to take into account topology info on the compute nodes. I have, for the most part, preserved the default behaviors, with three notable exceptions:
1. I have at long last bowed my head in submission to the system admins of managed clusters. For years, they have complained about our default of allowing users to oversubscribe nodes - i.e., to run more processes on a node than allocated slots. Accordingly, I have modified the default behavior: if you are running off of hostfile/dash-host allocated nodes, then the default is to allow oversubscription. If you are running off of RM-allocated nodes, then the default is to NOT allow oversubscription. Flags to override these behaviors are provided, so this only affects the default behavior.
2. both cpus/rank and stride have been removed. The latter was demanded by those who didn't understand the purpose behind it - and I agreed as the users who requested it are no longer using it. The former was removed temporarily pending implementation.
3. vm launch is now the sole method for starting OMPI. It was just too darned hard to maintain multiple launch procedures - maybe someday, provided someone can demonstrate a reason to do so.
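The new oversubscription default from item 1 boils down to a small decision. The sketch below assumes hypothetical names throughout (`allow_oversubscribe`, the source labels, and the `user_override` argument standing in for the override flags); it is not the mapper's actual code.

```python
from typing import Optional


def allow_oversubscribe(allocation_source: str,
                        user_override: Optional[bool] = None) -> bool:
    """Return whether oversubscription is allowed.

    allocation_source uses hypothetical labels: "hostfile", "dash-host",
    or "rm" (resource-manager allocated).
    """
    if user_override is not None:
        return user_override          # an explicit flag always wins
    if allocation_source in ("hostfile", "dash-host"):
        return True                   # default: allow
    if allocation_source == "rm":
        return False                  # default: do NOT allow
    raise ValueError(f"unknown allocation source: {allocation_source!r}")
```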
As Jeff stated, it is impossible to fully test a change of this size. I have tested it on Linux and Mac, covering all the default and simple options, singletons, and comm_spawn. That said, I'm sure others will find problems, so I'll be watching MTT results until this stabilizes.
This commit was SVN r25476.
Modify the configure logic and the PMI components to accommodate Cray's approach. Refactor the PMI error reporting code so it resides in only one place. Cray actually decided -not- to define the PMI-2 error codes, so we have to use the PMI-1 codes instead. More fun.
This commit was SVN r25348.
Use hwloc to obtain the cpuset for each process during mpi_init, and share that info in the modex. As it arrives, use a new opal_hwloc_base utility function to parse the value against the local proc's cpuset and determine where they overlap. Cache the value in the pmap object as it may be referenced multiple times.
Thus, the return value from orte_ess.proc_get_locality is a 16-bit bitmask that describes the resources being shared with you. This bitmask can be tested using the macros in opal/mca/paffinity/paffinity.h
Locality is available for all procs, whether launched via mpirun or directly with an external launcher such as slurm or aprun.
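To make the overlap idea concrete, here is a loose sketch. Everything in it is an assumption for illustration: the `Locality` bit values are invented (the real bits and test macros are the ones in paffinity.h), and cpusets are modeled as plain Python sets rather than hwloc bitmaps.

```python
from enum import IntFlag


class Locality(IntFlag):
    # Hypothetical bit assignments, not the values used in the tree.
    NONE = 0
    ON_NODE = 0x1
    ON_SOCKET = 0x2
    ON_CORE = 0x4


def relative_locality(my_levels: dict, peer_cpuset: set) -> Locality:
    """Build a bitmask of the resources shared with a peer.

    my_levels maps a Locality bit to the set of cpus covered by the
    local proc's enclosing object at that level.
    """
    mask = Locality.NONE
    for bit, cpus in my_levels.items():
        if cpus & peer_cpuset:   # any overlap means the level is shared
            mask |= bit
    return mask
```

A caller would then test individual bits, for example `relative_locality(levels, peer) & Locality.ON_SOCKET`.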
This commit was SVN r25331.
zeroes);
if so, use it for bit-operations like opal_cube_dim and opal_hibit.
Implement two versions of power-of-two.
In case of opal_next_poweroftwo, this reduces the average execution
time from 83 cycles to 4 cycles (Intel Nehalem, icc, -O2, inlining,
measured rdtsc, with loop over 2^27 values).
Numbers for other functions are similar (but of course heavily depend
on the usage, e.g. opal_hibit() with a start of 4 does not save
much). The bsr instruction on AMD Opteron is also not as fast.
- Replace various places where the next power-of-two is computed.
Tested on Intel Nehalem Cluster with openib, compilers GNU-4.6.1 and
Intel-12.0.4 using mpi_testsuite -t "Collective" with 128 processes.
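For readers unfamiliar with the trick, the sketch below contrasts a shift loop with a bit-length computation. It is Python, where `int.bit_length()` plays the role a count-leading-zeros instruction plays in C. The function names are made up, inputs are assumed non-negative, and reading the "two versions" as a strictly-greater and an inclusive variant is an assumption of this example.

```python
def next_power_of_two_loop(value: int) -> int:
    """Loop version: smallest power of two strictly greater than value."""
    power2 = 1
    while value > 0:
        value >>= 1
        power2 <<= 1
    return power2


def next_power_of_two_bits(value: int) -> int:
    """Same result as the loop, computed from the bit length."""
    return 1 << value.bit_length()


def next_power_of_two_inclusive(value: int) -> int:
    """Smallest power of two >= value (value itself if already one)."""
    if value <= 1:
        return 1
    return 1 << (value - 1).bit_length()
```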
This commit was SVN r25270.
If a user specifically asks for rdmacm support in the configure script and
the librdmacm (regular and devel) libraries are not found, the configure
script will abort.
If a user didn't specify anything, and the rdmacm libraries are not found,
the configure script will continue after issuing a warning message:
"Please install librdmacm and librdmacm-devel or disable rdmacm support"
-- YK
This commit was SVN r25253.
* Only print returnable errors when verbose=1. Still print errors when
we're going to abort, since those obviously aren't returnable
This commit was SVN r25213.
* hdr_data now includes opcount and length for all messages, which is the match
bits for long and rndv messages
* Re-add probe implementation
This commit was SVN r25207.
- add 2 new device ids.
- default rq depth to 64, which proved good for large runs.
This commit should be added to cmr:v1.4:reviewer=jsquyres and
cmr:v1.5:reviewer=jsquyres
This commit was SVN r25145.
Global rdmacm_resolve_timeout defaults to 1000 (1000 ms), which is way
too small for even a 16 node x 12 core iwarp cluster in the presence
of drops. Bump up the default to 30000ms.
This commit fixes trac:2860 and should be added to cmr:v1.4:reviewer=jsquyres
and cmr:v1.5:reviewer=jsquyres
This commit was SVN r25144.
The following Trac tickets were found above:
Ticket 2860 --> https://svn.open-mpi.org/trac/ompi/ticket/2860
the command line, hwloc is just like any other external dependency
in OMPI: if we find it, we'll use it. If we don't find it, we'll
ignore it. See comments in opal/mca/hwloc/configure.m4 for an
explanation.
* Fix some copy-n-paste errors in opal/mca/hwloc/configure.m4
w.r.t. flags coming in from the winning component.
* Add another line in ompi_info's output about whether hwloc support
is included or not.
This commit was SVN r25134.
- slight change in the selection logic of the fs module, which makes
the ompio independent of the file system type (otherwise ompio
would also have required a configure script).
This commit was SVN r25118.
Don't just include pre-processor macros between two strings ("s1" #if 0 ... "s2")...
Rather print out the epoch as 0 always...
This commit was SVN r25110.
- configure:
- patch Makefiles which define library targets that depend on other libraries to prevent the following Libtool warning:
"libtool: link: warning: `...//*.la' seems to be moved"
(Libtool getting confused by the "//" in the library paths,
so remove the trailing '/' from all *LIBDIR variables.)
- vtwrapper:
- added options '-vt:showme-<compile|link>' to the compiler wrapper to show the compiler/linker flags that would be supplied to the underlying compiler
This commit was SVN r25105.
To enable the epochs and the resilient orte code, use the configure flag:
--enable-resilient-orte
This will define both:
ORTE_ENABLE_EPOCH
ORTE_RESIL_ORTE
This commit was SVN r25093.
libs don't seem to propagate correctly under certain circumstances. This will
hopefully make the nightly tests pass.
Also, remove the files that should not have been committed in the first place
:-)
This commit was SVN r25085.
- updated version number to 5.11.2
- fixed even more Coverity warnings
- vtunify:
- replaced leftover std::vector with LargeVectorC (fixes segfault during gathering)
- vtwrapper:
- do also escape '\', ''', '(', and ')' in arguments
This commit was SVN r25071.
specify btl_tcp_if_include because btl_tcp_if_exclude is defaulted to
the loopback devices.
This commit does a few things:
* Introduce a new OPAL MCA base function:
mca_base_param_check_exclusive_string(). It checks to see that the
''user'' does not set two MCA parameters that are mutually
exclusive by checking the source of those MCA param values.
* Use the above function in many BTLs (and the OOB TCP) to ensure
that <foo>_if_include and <foo>_if_exclude are not both specified
''by the user''.
* Re-arrange many of these BTLs to move their MCA registration code
into a separate component_register() function (vs. the
component_open() function).
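The source-based check described in the first bullet can be sketched as follows. This is not the signature or behavior of mca_base_param_check_exclusive_string(); `Param`, `USER_SOURCES`, the source labels, and `check_exclusive` are all hypothetical names for this example.

```python
from typing import NamedTuple


class Param(NamedTuple):
    value: str
    source: str  # hypothetical labels: "default", "env", "file", "cmdline"


# Sources that count as "set by the user" in this sketch (assumption).
USER_SOURCES = {"env", "file", "cmdline"}


def check_exclusive(params: dict, name_a: str, name_b: str) -> bool:
    """Return True if the pair is acceptable, False if the user set both."""
    def user_set(name: str) -> bool:
        p = params.get(name)
        return p is not None and p.source in USER_SOURCES

    return not (user_set(name_a) and user_set(name_b))
```

Looking at the source rather than the value is what lets a user specify btl_tcp_if_include while btl_tcp_if_exclude still carries its built-in default.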
This code has been nominally reviewed and checked by Ralph, George,
Terry, and Shiqing.
This commit was SVN r25043.
The following SVN revision numbers were found above:
r24976 --> open-mpi/ompi@8f4ac54336
- always check the result of OTF_WStream_get*Buffer since it might be NULL in case OTF_File_open fails
Changes to VT:
- CUDA Tracing:
- fixed configure stack for filtered kernels
- fixed buffer size for CUPTI tracing
- replaced error message with warning to continue tracing, even if a CUDA error occurred (VTCUDAsynchronizeEvt)
- vtunify:
- enlarged minimum message size for transferring local definitions to rank 0
- use binary search for searching already created global definitions
- use binary search for searching already created global marker definitions
- use LargeVectorC instead of std::vector for pre-allocating elements
- vtwrapper:
- added options '-vt:CC' and '-vt:c++' which are synonyms for '-vt:cxx'
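The binary-search lookup mentioned for vtunify can be illustrated with a short sketch. It is Python using the standard `bisect` module, not vtunify's C++ code; `GlobalDefinitions` and its ID numbering are hypothetical.

```python
import bisect


class GlobalDefinitions:
    """Keeps definition keys sorted so lookups can use binary search."""

    def __init__(self):
        self._keys = []  # sorted keys
        self._ids = []   # global ID stored at the same index as its key

    def find_or_create(self, key) -> int:
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._ids[i]          # already created
        new_id = len(self._keys) + 1     # IDs are never reused here
        self._keys.insert(i, key)
        self._ids.insert(i, new_id)
        return new_id
```

Each lookup takes a logarithmic number of comparisons instead of a linear scan, at the cost of keeping the list sorted on insertion.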
This commit was SVN r24997.
- Added dynamic SL support to xoob
- Fixed seg fault in finalization
- All the code has been moved to separate files: connect/btl_openib_connect_sl.{c,h}
- Compilation of the new files is conditional
This commit was SVN r24991.
comment out some unused parameter names. I didn't use
__opal_attribute_unused__ because comm_inln.h is (eventually) included
by <mpi.h>, and therefore we don't have all the OPAL config stuff
available. And it didn't seem worth it to add the optional
attribute_unused stuff to the top of mpi.h.
Thanks to Júlio Hoffimann for reporting the issue.
This commit was SVN r24989.
btl_tcp_if_include and btl_tcp_if_exclude are specified.
This commit was SVN r24976.
The following Trac tickets were found above:
Ticket 2838 --> https://svn.open-mpi.org/trac/ompi/ticket/2838
still unstable. Reverted errmgr modules back to the original errmgr (with the
updates since the resilient code was brought into the trunk).
This commit was SVN r24958.
- improved zlib compression
- otfprofile-mpi:
- fixed progress
Changes to VT:
- fixed C++ linker issue for manual instrumentation of multiple files
- fixed CUDA kernel launch configuration
- process and thread buffer size can be explicitly specified by the user via the environment variables VT_BUFFER_SIZE and VT_THREAD_BUFFER_SIZE
- fixed CUDA buffer management
- vtfilter:
- fixed progress
- vtwrapper:
- link CUPTI library, if available
- vtsetup:
- removed fixed path to *.dtd file in vtsetup-data.xml[.in] (fixes 'java.net.MalformedURLException')
This commit was SVN r24950.
''other'' direction, so to speak, compared to r24921).
This commit was SVN r24924.
The following SVN revision numbers were found above:
r24921 --> open-mpi/ompi@bd96d028de
they only add the context id to the tag selection of the underlying
messaging mechanism.
We would like to enable an MTL to maintain its own context data
per-communicator. This way an MTL will be able to queue incoming eager
messages and rendezvous requests on a per-communicator basis.
The MTL will be allowed to override comm->c_pml_comm member,
since it's unused in pml_cm anyway.
This commit was SVN r24858.
was explicitly requested. If it was, but the opensm-devel
package is not found, warn and abort.
Otherwise, make a best effort: if opensm-devel is found,
enable dynamic SL. If it's not found, disable dynamic
SL and build OMPI w/o it.
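The configure behavior described above amounts to a three-way decision. The sketch below expresses it in Python for illustration only; the real logic lives in the configure scripts, and `dynamic_sl_decision` and its result strings are hypothetical. An explicit request to disable the feature is not covered here.

```python
def dynamic_sl_decision(explicitly_requested: bool,
                        opensm_devel_found: bool) -> str:
    """Return what configure does about dynamic SL support."""
    if opensm_devel_found:
        return "enable"    # found, so enable dynamic SL
    if explicitly_requested:
        return "abort"     # requested but missing: warn and abort
    return "disable"       # best effort: build without dynamic SL
```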
This commit was SVN r24852.
otfprofile-mpi:
- added progress display
- added verbose messages
- added functions to synchronize the error indicator to all worker ranks
(enforces that all ranks will be terminated by calling MPI_Abort if anyone fails)
- wrap def. comments after 80 characters
- use pdf[la]tex instead of latex/dvipdf to convert TeX output to PDF
- added configure checks for pdf[la]tex and PGFPLOTS v1.4
- fixed function invocation statistics generated from summarized information (--stat)
- fixed memory leak
Changes to VT:
MPI wrappers:
- fixed wrapper generation for MPI implementations which don't support the MPI-2 standard (e.g. MVAPICH, MPICH)
- corrected IN_PLACE denotation for MPI_Alltoall* and MPI_Scatter*
vtwrapper:
- corrected detection of IBM XL's OpenMP flag -qsmp=*:omp:*
vtunify:
- fixed faulty cleanup of temporary files which occurred if VT is configured without trace compression support
This commit was SVN r24851.
request completion callback
* Use the completion callback pointer to remove all need for opal_progress
calls in the one-sided layer
This commit was SVN r24848.
- Added enable/disable configuration parameter for dynamic SL
- All the dynamic SL code is conditionalized
- Removed libibmad dependency
- Using only one include - ib_types.h (part of opensm-devel package)
- Removed all the macro and data types definitions, using the
existing definitions from ib_types.h instead
- general cleaning here and there
The async mode is not implemented yet - stay tuned...
This commit was SVN r24830.
Everyone will be starting at MIN anyway (until we implement restart of course)
so there's no reason to set the epoch to INVALID and then immediately reset it
to MIN. This way there's less room to make mistakes later.
This commit was SVN r24829.