- otfprofile[-mpi]:
- fixed compile error with the PGI compiler
Changes to VT:
- added support for LIBC [I/O] tracing on Cray XT platforms
- vtrun:
- do preload Dyninst runtime library (DYNINSTAPI_RT_LIB) when
instrumenting user functions by Dyninst
This commit was SVN r25505.
Brian dealt with this in the past by creating platform files and using "no-build" to block the components. This was clunky, but acceptable when only one organization was using that option. However, that number has now expanded to at least two more locations.
Accordingly, make --without-rte-support actually work by adding appropriate configury to prevent components from building when they shouldn't. While doing so, remove two frameworks (db and rmcast) that are no longer used as ORCM comes to a close (besides, they belonged in ORCM now anyway). Do some minor cleanups along the way.
This commit was SVN r25497.
https://svn.open-mpi.org/trac/ompi/wiki/ProcessPlacement
The wiki page is incomplete at the moment, but I hope to complete it over the next few days. I will provide updates on the devel list. As the wiki page states, the default and most commonly used options remain unchanged (except as noted below). New, esoteric and complex options have been added, but unless you are a true masochist, you are unlikely to use many of them beyond perhaps an initial curiosity-motivated experimentation.
In a nutshell, this commit revamps the map/rank/bind procedure to take into account topology info on the compute nodes. I have, for the most part, preserved the default behaviors, with three notable exceptions:
1. I have at long last bowed my head in submission to the system admins of managed clusters. For years, they have complained about our default of allowing users to oversubscribe nodes - i.e., to run more processes on a node than allocated slots. Accordingly, I have modified the default behavior: if you are running off of hostfile/dash-host allocated nodes, then the default is to allow oversubscription. If you are running off of RM-allocated nodes, then the default is to NOT allow oversubscription. Flags to override these behaviors are provided, so this only affects the default behavior.
2. both cpus/rank and stride have been removed. The latter was demanded by those who didn't understand the purpose behind it - and I agreed as the users who requested it are no longer using it. The former was removed temporarily pending implementation.
3. vm launch is now the sole method for starting OMPI. It was just too darned hard to maintain multiple launch procedures - maybe someday, provided someone can demonstrate a reason to do so.
As Jeff stated, it is impossible to fully test a change of this size. I have tested it on Linux and Mac, covering all the default and simple options, singletons, and comm_spawn. That said, I'm sure others will find problems, so I'll be watching MTT results until this stabilizes.
This commit was SVN r25476.
Modify the configure logic and the PMI components to accommodate Cray's approach. Refactor the PMI error reporting code so it resides in only one place. Cray actually decided -not- to define the PMI-2 error codes, so we have to use the PMI-1 codes instead. More fun.
This commit was SVN r25348.
Use hwloc to obtain the cpuset for each process during mpi_init, and share that info in the modex. As it arrives, use a new opal_hwloc_base utility function to parse the value against the local proc's cpuset and determine where they overlap. Cache the value in the pmap object as it may be referenced multiple times.
Thus, the return value from orte_ess.proc_get_locality is a 16-bit bitmask that describes the resources being shared with you. This bitmask can be tested using the macros in opal/mca/paffinity/paffinity.h.
Locality is available for all procs, whether launched via mpirun or directly with an external launcher such as slurm or aprun.
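As a rough illustration of how such a locality bitmask could be built and tested, here is a minimal sketch. The flag names, the test macro, and compute_locality() are made up for this example; the real macros live in opal/mca/paffinity/paffinity.h and may differ in name and bit layout. Cpusets are modeled as plain 64-bit masks on an assumed toy node (2 sockets, 4 single-PU cores each) rather than as hwloc bitmaps.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical locality flags; NOT the actual values from paffinity.h. */
#define LOC_ON_NODE   ((uint16_t)0x0001)
#define LOC_ON_SOCKET ((uint16_t)0x0002)
#define LOC_ON_CORE   ((uint16_t)0x0004)

/* Hypothetical test macro: does the peer share the given resource level? */
#define LOC_SHARES(locality, flag) (((locality) & (flag)) != 0)

/* Assumed toy topology: PUs 0-3 sit on socket 0, PUs 4-7 on socket 1,
 * and every core has exactly one PU. */
static const uint64_t socket_mask[2] = { 0x0F, 0xF0 };

/* Compare the local cpuset with a peer's cpuset (both on the same node)
 * and report which resource levels overlap. */
static uint16_t compute_locality(uint64_t mine, uint64_t peer)
{
    uint16_t locality = LOC_ON_NODE;   /* same node by assumption */
    int s;

    assert(mine != 0 && peer != 0);    /* an empty cpuset is meaningless */

    for (s = 0; s < 2; s++) {
        if ((mine & socket_mask[s]) && (peer & socket_mask[s])) {
            locality |= LOC_ON_SOCKET;
        }
    }
    if (mine & peer) {
        locality |= LOC_ON_CORE;       /* one PU per core in this toy node */
    }
    return locality;
}
```

A caller would compute the value once per peer, cache it, and then test it with something like LOC_SHARES(locality, LOC_ON_SOCKET) whenever a decision depends on it.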
This commit was SVN r25331.
zeroes);
if so, use it for bit-operations like opal_cube_dim and opal_hibit.
Implement two versions of power-of-two.
In case of opal_next_poweroftwo, this reduces the average execution
time from 83 cycles to 4 cycles (Intel Nehalem, icc, -O2, inlining,
measured rdtsc, with loop over 2^27 values).
Numbers for other functions are similar (but of course heavily depend
on the usage, e.g. opal_hibit() with a start of 4 does not save
much). The bsr instruction on AMD Opteron is also not as fast.
- Replace various places where the next power-of-two is computed.
Tested on Intel Nehalem Cluster with openib, compilers GNU-4.6.1 and
Intel-12.0.4 using mpi_testsuite -t "Collective" with 128 processes.
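For readers unfamiliar with the technique, the following is a minimal sketch of a shift-loop power-of-two next to a count-leading-zeroes variant. The function names are hypothetical, and reading the "two versions" as a strictly-greater and an inclusive variant is an assumption of this sketch, not something stated above. It relies on the GCC/Clang builtin __builtin_clz and restricts inputs so the result fits in an unsigned.

```c
#include <assert.h>
#include <limits.h>

/* Portable fallback: shift until the value is exhausted.
 * Returns the smallest power of two strictly greater than value. */
static unsigned next_pow2_loop(unsigned value)
{
    unsigned power2 = 1;

    assert(value <= UINT_MAX / 2);     /* result must fit in an unsigned */
    for (; value > 0; value >>= 1) {
        power2 <<= 1;
    }
    return power2;
}

/* Same result computed from the position of the highest set bit. */
static unsigned next_pow2_clz(unsigned value)
{
    assert(value <= UINT_MAX / 2);
    if (value == 0) {
        return 1;                      /* __builtin_clz(0) is undefined */
    }
    return 1u << (sizeof(unsigned) * CHAR_BIT - (unsigned)__builtin_clz(value));
}

/* Assumed second variant: smallest power of two >= value. */
static unsigned next_pow2_inclusive(unsigned value)
{
    if (value <= 1) {
        return 1;
    }
    return next_pow2_clz(value - 1);
}
```

The builtin version replaces a data-dependent loop with a single bit-scan plus a shift, which is where a speedup of the kind quoted above would come from; the actual gain depends on compiler and CPU.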
This commit was SVN r25270.
If a user specifically asks for rdmacm support in the configure script and
the librdmacm (usual and devel) libraries are not found, the configure
script will abort.
If a user didn't specify anything, and the rdmacm libraries are not found,
the configure script will continue after issuing a warning message:
"Please install librdmacm and librdmacm-devel or disable rdmacm support"
-- YK
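The decision described above can be written down as a small table. The sketch below models it in C purely for illustration (the real logic lives in the configure script); the type and function names are hypothetical, and the explicit "disabled by the user" case is an assumption inferred from the warning text.

```c
#include <assert.h>
#include <stdbool.h>

/* What the user asked for on the configure command line (hypothetical). */
typedef enum { REQ_UNSPECIFIED, REQ_YES, REQ_NO } request_t;

/* What configure ends up doing (hypothetical). */
typedef enum {
    OUT_ENABLED,
    OUT_DISABLED,
    OUT_DISABLED_WITH_WARNING,
    OUT_ABORT
} outcome_t;

static outcome_t decide_rdmacm(request_t requested, bool libs_found)
{
    switch (requested) {
    case REQ_NO:
        return OUT_DISABLED;                   /* assumed: user opted out */
    case REQ_YES:
        /* Explicit request: missing libraries are a hard error. */
        return libs_found ? OUT_ENABLED : OUT_ABORT;
    case REQ_UNSPECIFIED:
        /* No request: warn and keep going without rdmacm. */
        return libs_found ? OUT_ENABLED : OUT_DISABLED_WITH_WARNING;
    }
    assert(!"unreachable: unknown request value");
    return OUT_ABORT;
}
```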
This commit was SVN r25253.
* Only print returnable errors when verbose=1. Still print errors when
we're going to abort, since those obviously aren't returnable
This commit was SVN r25213.
* hdr_data now includes opcount and length for all messages, which is the match
bits for long and rndv messages
* Re-add probe implementation
This commit was SVN r25207.
- add 2 new device ids.
- default rq depth to 64, which proved good for large runs.
This commit should be added to cmr:v1.4:reviewer=jsquyres and
cmr:v1.5:reviewer=jsquyres
This commit was SVN r25145.
Global rdmacm_resolve_timeout defaults to 1000 (1000 ms), which is way
too small for even a 16 node x 12 core iwarp cluster in the presence
of drops. Bump up the default to 30000ms.
This commit fixes trac:2860 and should be added to cmr:v1.4:reviewer=jsquyres
and cmr:v1.5:reviewer=jsquyres
This commit was SVN r25144.
The following Trac tickets were found above:
Ticket 2860 --> https://svn.open-mpi.org/trac/ompi/ticket/2860
the command line, hwloc is just like any other external dependency
in OMPI: if we find it, we'll use it. If we don't find it, we'll
ignore it. See comments in opal/mca/hwloc/configure.m4 for an
explanation.
* Fix some copy-n-paste errors in opal/mca/hwloc/configure.m4
w.r.t. flags coming in from the winning component.
* Add another line in ompi_info's output about whether hwloc support
is included or not.
This commit was SVN r25134.
- slight change in the selection logic of the fs module, which makes
the ompio independent of the file system type (otherwise ompio
would also have required a configure script).
This commit was SVN r25118.
Don't just include pre-processor macros between two strings ("s1" #if 0 ... "s2")...
Rather print out the epoch as 0 always...
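A minimal sketch of the pattern, with made-up names (DEMO_ENABLE_EPOCH, demo_name_t, demo_format_name): the conditional is resolved once in a helper macro, so the format string itself never has to be split by #if. Presumably the original problem arose because the literals sat inside a macro invocation; C leaves the behavior undefined when preprocessing directives appear within the arguments of a function-like macro.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical feature switch; the real build uses its own macro. */
#ifndef DEMO_ENABLE_EPOCH
#define DEMO_ENABLE_EPOCH 0
#endif

typedef struct {
    unsigned jobid;
    unsigned vpid;
#if DEMO_ENABLE_EPOCH
    unsigned epoch;
#endif
} demo_name_t;

/* Resolve the conditional once, outside of any format string. */
#if DEMO_ENABLE_EPOCH
#define DEMO_EPOCH(n) ((n)->epoch)
#else
#define DEMO_EPOCH(n) 0u
#endif

/* Fragile pattern (shown only as a comment):
 *     LOG(("[%u,%u]"
 *     #if DEMO_ENABLE_EPOCH
 *          " epoch %u"
 *     #endif
 *          , ...));
 * Robust pattern: one format string, epoch printed as 0 when disabled. */
static int demo_format_name(char *buf, size_t len, const demo_name_t *n)
{
    assert(buf != NULL && n != NULL);
    return snprintf(buf, len, "[%u,%u] epoch %u",
                    n->jobid, n->vpid, DEMO_EPOCH(n));
}
```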
This commit was SVN r25110.
- configure:
- patch Makefiles which define library targets that depend on other libraries to prevent the following Libtool warning:
"libtool: link: warning: `...//*.la' seems to be moved"
(Libtool getting confused by the "//" in the library paths,
so remove the trailing '/' from all *LIBDIR variables.)
- vtwrapper:
- added options '-vt:showme-<compile|link>' to the compiler wrapper to show the compiler/linker flags that would be supplied to the underlying compiler
This commit was SVN r25105.
To enable the epochs and the resilient orte code, use the configure flag:
--enable-resilient-orte
This will define both:
ORTE_ENABLE_EPOCH
ORTE_RESIL_ORTE
This commit was SVN r25093.
libs don't seem to propagate correctly under certain circumstances. This
hopefully makes the nightly tests pass.
also, remove the files that should not have been committed in the first place
:-)
This commit was SVN r25085.
- updated version number to 5.11.2
- fixed even more Coverity warnings
- vtunify:
- replaced std::vector relict by LargeVectorC (fixes segfault during gathering)
- vtwrapper:
- also escape '\', ''', '(', and ')' in arguments
This commit was SVN r25071.
specify btl_tcp_if_include because btl_tcp_if_exclude is defaulted to
the loopback devices.
This commit does a few things:
* Introduce a new OPAL MCA base function:
mca_base_param_check_exclusive_string(). It checks to see that the
''user'' does not set two MCA parameters that are mutually
exclusive by checking the source of those MCA param values.
* Use the above function in many BTLs (and the OOB TCP) to ensure
that <foo>_if_include and <foo>_if_exclude are not both specified
''by the user''.
* Re-arrange many of these BTLs to move their MCA registration code
into a separate component_register() function (vs. the
component_open() function).
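To make the "check the source, not the value" idea concrete, here is a minimal sketch. Everything in it (demo_source_t, demo_param_t, demo_check_exclusive, the return codes) is hypothetical and is not the signature of mca_base_param_check_exclusive_string(); the device names in the usage note are example values only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of where a parameter's current value came from. */
typedef enum { SRC_DEFAULT, SRC_ENV, SRC_FILE, SRC_OVERRIDE } demo_source_t;

typedef struct {
    const char   *name;
    const char   *value;    /* may be non-NULL even if the user set nothing */
    demo_source_t source;
} demo_param_t;

#define DEMO_SUCCESS        0
#define DEMO_ERR_EXCLUSIVE (-1)

/* Two parameters conflict only if BOTH were set by the user. A value that
 * merely comes from the built-in default does not count. */
static int demo_check_exclusive(const demo_param_t *a, const demo_param_t *b)
{
    int a_user, b_user;

    assert(a != NULL && b != NULL);
    a_user = (a->source != SRC_DEFAULT) && (a->value != NULL);
    b_user = (b->source != SRC_DEFAULT) && (b->value != NULL);
    return (a_user && b_user) ? DEMO_ERR_EXCLUSIVE : DEMO_SUCCESS;
}
```

With an exclude parameter whose value is a defaulted loopback device and an include parameter set by the user, the check passes; only when both carry a non-default source does it report a conflict.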
This code has been nominally reviewed and checked by Ralph, George,
Terry, and Shiqing.
This commit was SVN r25043.
The following SVN revision numbers were found above:
r24976 --> open-mpi/ompi@8f4ac54336
- always check the result of OTF_WStream_get*Buffer since it might be NULL in case OTF_File_open fails
Changes to VT:
- CUDA Tracing:
- fixed configure stack for filtered kernels
- fixed buffer size for CUPTI tracing
- replaced error message with warning to continue tracing, even if CUDA error occurred (VTCUDAsynchronizeEvt)
- vtunify:
- enlarged minimum message size for transferring local definitions to rank 0
- use binary search for searching already created global definitions
- use binary search for searching already created global marker definitions
- use LargeVectorC instead of std::vector for pre-allocating elements
- vtwrapper:
- added options '-vt:CC' and '-vt:c++' which are synonyms for '-vt:cxx'
This commit was SVN r24997.
- Added dynamic SL support to xoob
- Fixed seg fault in finalization
- All the code has been moved to separate files: connect/btl_openib_connect_sl.{c,h}
- Compilation of the new files is conditionalized
This commit was SVN r24991.
comment out some unused parameter names. I didn't use
__opal_attribute_unused__ because comm_inln.h is (eventually) included
by <mpi.h>, and therefore we don't have all the OPAL config stuff
available. And it didn't seem worth it to add the optional
attribute_unused stuff to the top of mpi.h.
Thanks to Júlio Hoffimann for reporting the issue.
This commit was SVN r24989.
btl_tcp_if_include and btl_tcp_if_exclude are specified.
This commit was SVN r24976.
The following Trac tickets were found above:
Ticket 2838 --> https://svn.open-mpi.org/trac/ompi/ticket/2838
still unstable. Reverted errmgr modules back to the original errmgr (with the
updates since the resilient code was brought into the trunk).
This commit was SVN r24958.
- improved zlib compression
- otfprofile-mpi:
- fixed progress
Changes to VT:
- fixed C++ linker issue for manual instrumentation of multiple files
- fixed CUDA kernel launch configuration
- process and thread buffer size can be explicitly specified by the user via the environment variables VT_BUFFER_SIZE and VT_THREAD_BUFFER_SIZE
- fixed CUDA buffer management
- vtfilter:
- fixed progress
- vtwrapper:
- link CUPTI library, if available
- vtsetup:
- removed fixed path to *.dtd file in vtsetup-data.xml[.in] (fixes 'java.net.MalformedURLException')
This commit was SVN r24950.
''other'' direction, so to speak, compared to r24921).
This commit was SVN r24924.
The following SVN revision numbers were found above:
r24921 --> open-mpi/ompi@bd96d028de
they only add the context id to the tag selection of the underlying
messaging mechanism.
We would like to enable an MTL to maintain its own context data
per-communicator. This way an MTL will be able to queue incoming eager
messages and rendezvous requests on a per-communicator basis.
The MTL will be allowed to override comm->c_pml_comm member,
since it's unused in pml_cm anyway.
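A minimal sketch of what per-communicator MTL state could look like. All types and functions here are hypothetical stand-ins (only the c_pml_comm member name comes from the text above, and it is modeled as a void pointer for simplicity); the queue is a plain singly linked FIFO.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an eager message or rendezvous request. */
typedef struct demo_msg {
    struct demo_msg *next;
    int tag;
} demo_msg_t;

/* Hypothetical per-communicator MTL state: a FIFO of pending messages. */
typedef struct {
    demo_msg_t *head;
    demo_msg_t *tail;
} demo_mtl_comm_t;

/* Hypothetical communicator with the slot the MTL is allowed to claim. */
typedef struct {
    int   context_id;
    void *c_pml_comm;
} demo_comm_t;

/* On communicator creation: attach empty MTL state. */
static int demo_mtl_add_comm(demo_comm_t *comm)
{
    comm->c_pml_comm = calloc(1, sizeof(demo_mtl_comm_t));
    return (comm->c_pml_comm != NULL) ? 0 : -1;
}

/* Queue an incoming message on the communicator it belongs to. */
static void demo_mtl_enqueue(demo_comm_t *comm, demo_msg_t *msg)
{
    demo_mtl_comm_t *ctx = comm->c_pml_comm;

    assert(ctx != NULL);
    msg->next = NULL;
    if (ctx->tail != NULL) {
        ctx->tail->next = msg;
    } else {
        ctx->head = msg;
    }
    ctx->tail = msg;
}

/* Take the oldest queued message, or NULL if none is pending. */
static demo_msg_t *demo_mtl_dequeue(demo_comm_t *comm)
{
    demo_mtl_comm_t *ctx = comm->c_pml_comm;
    demo_msg_t *msg;

    assert(ctx != NULL);
    msg = ctx->head;
    if (msg != NULL) {
        ctx->head = msg->next;
        if (ctx->head == NULL) {
            ctx->tail = NULL;
        }
    }
    return msg;
}

/* On communicator destruction: release the MTL state. */
static void demo_mtl_del_comm(demo_comm_t *comm)
{
    free(comm->c_pml_comm);
    comm->c_pml_comm = NULL;
}
```

Keeping the queue on the communicator means a receive only has to search messages for its own communicator instead of one global list.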
This commit was SVN r24858.
was explicitly requested. If it was, but the opensm-devel
package is not found, warn and abort.
Otherwise, make a best effort: if opensm-devel is found,
enable dynamic SL. If it is not found, disable dynamic
SL and build OMPI without it.
This commit was SVN r24852.
otfprofile-mpi:
- added progress display
- added verbose messages
- added functions to synchronize the error indicator to all worker ranks
(enforces that all ranks will be terminated by calling MPI_Abort if anyone fails)
- wrap def. comments after 80 characters
- use pdf[la]tex instead of latex/dvipdf to convert TeX output to PDF
- added configure checks for pdf[la]tex and PGFPLOTS v1.4
- fixed function invocation statistics generated from summarized information (--stat)
- fixed memory leak
Changes to VT:
MPI wrappers:
- fixed wrapper generation for MPI implementations which don't support the MPI-2 standard (e.g. MVAPICH, MPICH)
- corrected IN_PLACE denotation for MPI_Alltoall* and MPI_Scatter*
vtwrapper:
- corrected detection of IBM XL's OpenMP flag -qsmp=*:omp:*
vtunify:
- fixed faulty cleanup of temporary files which occurred if VT is configured without trace compression support
This commit was SVN r24851.
request completion callback
* Use the completion callback pointer to remove all need for opal_progress
calls in the one-sided layer
This commit was SVN r24848.
- Added enable/disable configuration parameter for dynamic SL
- All the dynamic SL code is conditionalized
- Removed libibmad dependency
- Using only one include - ib_types.h (part of opensm-devel package)
- Removed all the macro and data types definitions, using the
existing definitions from ib_types.h instead
- general cleaning here and there
The async mode is not implemented yet - stay tuned...
This commit was SVN r24830.
Everyone will be starting at MIN anyway (until we implement restart of course)
so there's no reason to set the epochs to INVALID and then immediately reset them
to MIN. This way there's less room to make mistakes later.
This commit was SVN r24829.