Per RFC. There are two optimizations in this commit:
- Allocate requests for blocking sends and receives on the stack. This
bypasses the request free list and saves two atomics on the critical path.
This change improves the small message ping-pong by 50-200ns on both AMD
and Intel CPUs.
- For small messages try to use the btl sendi function before intializing a
send request. If the sendi fails or the btl does not have a sendi function
silently fallback on the standard send path.
cmr=v1.7.5:reviewer=brbarret
This commit was SVN r30343.
Adds coll_hcoll_np mca parameter similar to that of fca component (defaults to 32). Those who use hcoll be aware that from now on the communicators less than 32 procs will run w/o hcoll by default. - Resolves fallback issue in case libhcoll runs out of allowed contexts. The solution is moving hcoll_context_create from comm_enable to comm_query. Shortly, comm_enable should never return OMPI_ERROR in the coll component with highest priority (hcoll). Otherwise the ompi coll_base_select will unselect the coll funtion pointers and module references leaving the communicator w/o coll pointer. This will cause the fail. Same behavior can be reproduced even with tuned if one would hardcore some "return OMPI_ERROR" into it's module_enable funtion. - Additionally, removed all the dead code under #if 0; removed unused variables (path for library, active_modules list) and classes (module list wrapper)
Fixed by Val, Reviewed by Devendar/Josh/Miked
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30341.
This commit fixes an error path that occurs when huge page allocations are
enabled. In this case we allocate a huge page and try to register it but fail.
We then were calling free on the opal object. Fix this by calling the proper free
function.
cmr=v1.7.4:reviewer=rhc
This commit was SVN r30289.
Also add a verbose flag so one can see what devices are selected as well as another flag to override
locality information and use all devices on the node.
This commit was SVN r30287.
NetBSD puts the AIO functions in -lrt, vs. the usual libc. So we
need the fbtl/posix configure.m4 to test for -lrt properly.
Reviewed by Jeff Squyres.
cmr=v1.7.4:reviewer=ompi-rm1.7:subject=Fix NetBSD use of -laio
This commit was SVN r30274.
NOTE: launch performance will be absolutely awful if you do this with BTLs that aren't configured to modex_recv on first message!
Even with "modex on demand", we still have to do a barrier in place of the modex - we simply don't move any data around, which does reduce the time impact. The barrier is required to ensure that the other proc has in fact registered all its BTL info and therefore is prepared to hand over a complete data package. Otherwise, you may not get the info you need. In addition, the shared memory BTL can fail to properly rendezvous as it expects the barrier to be in place.
This behavior will *only* take effect under the following conditions:
1. launched via mpirun
2. #procs is greater than ompi_hostname_cutoff, which defaults to UINT32_MAX
3. mca param rte_orte_direct_modex is set to 1. At the moment, we are having problems getting this param to register properly, so only the first two conditions are in effect. Still, the bottom line is you have to *want* this behavior to get it.
The planned next evolution of this will be to make the direct modex be non-blocking - this will require two fixes:
1. if the remote proc doesn't have the required info, then let it delay its response until it does. This means we need a way for the MPI layer to tell the RTE "I am done entering modex data".
2. adjust the SM rendezvous logic to loop until the required file has been created
Creating a placeholder to bring this over to 1.7.5 when ready.
cmr=v1.7.5:reviewer=hjelmn:subject=Enable direct modex at scale
This commit was SVN r30259.
intended to be used and emits a compile-time warning.
Thanks to Paul Hargrove for identifying the issue.
cmr=v1.7.4:reviewer=hjelmn:subject=remove/replace malloc.h
This commit was SVN r30231.
Avoid compiler warning about (unnecessarily) initializing 2 variables
during instantiation at the top of a switch block (but outside of any
case statements): just declare the variables at the top of the outter
block. They're already safely initialized, so don't worry about
initializing them in the instantiation.
Reviewed by Dave Goodell.
cmr=v1.7.4:reviewer=ompi-rm1.7:subject=Don't instantiate+init variables in a switch block
This commit was SVN r30228.
ob1 dummy registration to actually be used when using udreg. Fix this by
always setting reg to NULL when mpool/udreg's register function fails.
cmr=v1.7.4:reviewer=rhc
This commit was SVN r30214.
Set comm attribute with keyval.
Wait for pending hcoll module tasks in comm delete callback where PML
still valid on the communicator. safely destroy hcoll context during
hcoll module destructor.
Author: Devendar Bureddy
reviewed by miked
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30175.
needed for correctness. The if_include/if_exclude are level 1, and
the TCP port range params are level 2; this parameter seems to be on
par with the TCP port range params.
Refs trac:4019
This commit was SVN r30161.
The following Trac tickets were found above:
Ticket 4019 --> https://svn.open-mpi.org/trac/ompi/ticket/4019
* Remove some set-but-not-used variables
* Make a convenience function return void (we weren't using the
return code, anyway)
* Mark a function as inline (it was supposed to be inline anyway)
Reviewed by Dave Goodell.
cmr=v1.7.5:reviewer=ompi-rm1.7:subject=Fix usnic BTL compiler warnings
This commit was SVN r30160.
Thanks to Tetsuya Mishima for detecting it!
cmr=v1.7.4:reviewer=jsquyres:subject=Correct tcp_not_use_nodelay option processing
This commit was SVN r30157.
- HCOLL close without init
- Call hcoll progress after comm finalize
- mpirun default for coll_hcoll_enable is 1
fixed by Igor, reviewed by miked
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30156.
configury/Makefile.am changes; this commit renames the internal
installdirs.h framework struct field names to match the configry macro
names:
* pkgdatdir -> ompidatadir
* pkglibdir -> ompilibdir
* pkgincludedir -> ompiincludedir
This commit was SVN r30145.
The following SVN revision numbers were found above:
r30140 --> open-mpi/ompi@8b778903d8
pkg{data,lib,includedir}, use our own ompi{data,lib,includedir}, which is
always set to {datadir,libdir,includedir}/openmpi. This will keep us from
having help files in prefix/share/open-rte when building without Open MPI,
but in prefix/share/openmpi when building with Open MPI.
This commit was SVN r30140.
Complements r30073: tighten up the string parsing of the vendor parts
ID MCA param a bit. Also fix a small memory leak: ensure to free the
array uint32_t's parsed out of the MCA param.
This commit was SVN r30128.
The following SVN revision numbers were found above:
r30073 --> open-mpi/ompi@6003702a51
The following Trac tickets were found above:
Ticket 4301 --> https://svn.open-mpi.org/trac/ompi/ticket/4301
This commit adds support for placing the send memory segment in a
traditional shared memory segment when XPMEM is not available. The
current default is to reserve 4MB for shared memory on each process.
The latest benchmarks show vader performing better than sm on both
Intel and AMD CPUs.
For large messages vader will now use CMA if it is available (and
XPMEM is not).
cmr=v1.7.5:reviewer=jsquyres
This commit was SVN r30123.