Per RFC. There are two optimizations in this commit:
- Allocate requests for blocking sends and receives on the stack. This
bypasses the request free list and saves two atomics on the critical path.
This change improves the small message ping-pong by 50-200ns on both AMD
and Intel CPUs.
- For small messages, try to use the btl sendi function before initializing a
send request. If the sendi fails or the btl does not have a sendi function,
silently fall back on the standard send path.
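
The two optimizations above can be sketched roughly as follows. This is only an illustration of the control flow, not the actual PML/BTL code: every name prefixed with sketch_ or SKETCH_ is hypothetical, the "standard send path" is a stub, and the eager limit field is an assumption used to decide what counts as a small message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the Open MPI PML/BTL API. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_RESOURCE = -1 };

typedef int (*sketch_sendi_fn)(const void *buf, size_t len);

typedef struct {
    sketch_sendi_fn sendi;   /* NULL when the transport has no immediate send */
    size_t eager_limit;      /* assumed cutoff for "small" messages */
} sketch_btl_t;

typedef struct {
    const void *buf;
    size_t len;
    int complete;
} sketch_request_t;

/* Stub transports used to exercise both outcomes of the fast path. */
int sketch_sendi_ok(const void *buf, size_t len)
{
    (void)buf; (void)len;
    return SKETCH_SUCCESS;
}

int sketch_sendi_busy(const void *buf, size_t len)
{
    (void)buf; (void)len;
    return SKETCH_ERR_RESOURCE;
}

/* Stand-in for the standard send path: it just marks the request complete. */
static int sketch_standard_send(sketch_request_t *req)
{
    req->complete = 1;
    return SKETCH_SUCCESS;
}

int sketch_blocking_send(const sketch_btl_t *btl, const void *buf,
                         size_t len, int *used_fast_path)
{
    *used_fast_path = 0;

    /* Fast path: attempt the immediate send before any request is set up. */
    if (NULL != btl->sendi && len <= btl->eager_limit) {
        if (SKETCH_SUCCESS == btl->sendi(buf, len)) {
            *used_fast_path = 1;
            return SKETCH_SUCCESS;
        }
        /* The failure is not reported; we silently fall through. */
    }

    /* Blocking call: the request lives on the stack, so no free list
     * (and none of its atomics) is touched on this path. */
    sketch_request_t req = { buf, len, 0 };
    int rc = sketch_standard_send(&req);
    assert(SKETCH_SUCCESS != rc || req.complete);
    return rc;
}
```

A caller would pass a sketch_btl_t and inspect used_fast_path only for testing; the point of the design is that the caller never sees whether the fast path was taken.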
cmr=v1.7.5:reviewer=brbarret
This commit was SVN r30343.
Gilles Gouaillardet solution attached to ticket #4145.
Closes trac:4145.
cmr=v1.7.4:reviewer=ompi-rm1.7
cmr=v1.6.6:reviewer=ompi-rm1.6
This commit was SVN r30342.
The following Trac tickets were found above:
Ticket 4145 --> https://svn.open-mpi.org/trac/ompi/ticket/4145
- Adds the coll_hcoll_np MCA parameter, similar to that of the fca component
(defaults to 32). Users of hcoll should be aware that from now on communicators
with fewer than 32 procs will run w/o hcoll by default.
- Resolves the fallback issue when libhcoll runs out of allowed contexts. The
solution is to move hcoll_context_create from comm_enable to comm_query. In
short, comm_enable should never return OMPI_ERROR in the coll component with
the highest priority (hcoll). Otherwise the ompi coll_base_select will unselect
the coll function pointers and module references, leaving the communicator w/o
a coll pointer, which causes the failure. The same behavior can be reproduced
even with tuned by hardcoding a "return OMPI_ERROR" into its module_enable
function.
- Additionally, removed all the dead code under #if 0; removed unused variables
(path for library, active_modules list) and classes (module list wrapper).
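
The reasoning behind moving the context creation can be illustrated with a small sketch. It assumes a simplified two-component selection in which a query that returns NULL lets the next component be chosen. All sketch_ names, the priorities, and the context budget are hypothetical and do not reflect the real coll framework interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real ompi coll framework types. */
typedef struct { int size; } sketch_comm_t;
typedef struct { const char *name; int priority; } sketch_module_t;

static int sketch_contexts_left = 1;   /* assumed library context budget */
static const int sketch_np = 32;       /* mirrors the coll_hcoll_np default */

static sketch_module_t sketch_hcoll_module = { "hcoll-like", 90 };
static sketch_module_t sketch_tuned_module = { "tuned-like", 30 };

/* All fallible work happens at query time: declining here is harmless
 * because the selection logic simply moves on to the next component. */
sketch_module_t *sketch_hcoll_query(const sketch_comm_t *comm)
{
    if (comm->size < sketch_np) {
        return NULL;                   /* communicator too small */
    }
    if (sketch_contexts_left <= 0) {
        return NULL;                   /* out of contexts: decline early */
    }
    sketch_contexts_left--;            /* context "created" during query */
    return &sketch_hcoll_module;
}

sketch_module_t *sketch_tuned_query(const sketch_comm_t *comm)
{
    (void)comm;
    return &sketch_tuned_module;       /* always available as a fallback */
}

/* Simplified selection: highest priority first, fall back on decline. */
sketch_module_t *sketch_select(const sketch_comm_t *comm)
{
    sketch_module_t *m = sketch_hcoll_query(comm);
    return (NULL != m) ? m : sketch_tuned_query(comm);
}
```

With the failure surfaced at query time, there is nothing left for the enable step to fail on, so the communicator always ends up with some usable module.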
Fixed by Val, Reviewed by Devendar/Josh/Miked
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30341.
The check to enable SHMEM Fortran ran too early: MPI can disable Fortran, but the SHMEM Fortran check had already been done.
Refs trac:3763
This commit was SVN r30340.
The following Trac tickets were found above:
Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
We need to explicitly call mca_base_group_unselect in finalize
for each group that is not freed by oshmem_group_cache_list_free before we unload the scoll framework.
Refs trac:3763
This commit was SVN r30311.
The following Trac tickets were found above:
Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
work around buggy NUMA node cpusets (i.e., buggy BIOSs).
Thanks to Jeff Becker for reporting the issue.
Submitted by Brice Goglin, reviewed by Jeff Squyres.
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30306.
weren't building the MPI C++ bindings because this causes "CXX=no" to
be passed to the mpicxx wrapper compiler.
Unfortunately, this optimization was brought to the v1.7 branch, and
therefore if you --disable-mpi-cxx on the v1.7 branch, the same
problem will happen. So this needs to be CMR'ed to v1.7 as well.
Submitted by Jeff, reviewed by Brian. RM approved by Brian.
Fixes trac:4120.
cmr=v1.7.4:reviewer=ompi-gk1.7:subject=Fix mpicxx wrapper compiler
This commit was SVN r30302.
The following Trac tickets were found above:
Ticket 4120 --> https://svn.open-mpi.org/trac/ompi/ticket/4120
* Remove an old/outdated bullet about PGI and OS X (that bug has since
been fixed).
cmr=v1.7.4:reviewer=rhc:subject=Update README bullets
This commit was SVN r30290.
This commit fixes an error path that occurs when huge page allocations are
enabled. In this case we allocate a huge page and try to register it but fail.
We then were calling free on the opal object. Fix this by calling the proper free
function.
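
A generic sketch of this kind of error path follows. It assumes the allocation records its own release function so that the failure branch cannot pick the wrong one. The sketch_ names are hypothetical, malloc stands in for a real huge page mapping, and the registration step is a stub whose failure is injected by the caller.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical chunk type: it remembers how it must be released. */
typedef struct sketch_chunk {
    void *base;
    size_t size;
    void (*release)(struct sketch_chunk *);
} sketch_chunk_t;

int sketch_hugepage_live = 0;   /* counts outstanding "huge page" chunks */

static void sketch_hugepage_release(sketch_chunk_t *c)
{
    assert(NULL != c->base);
    free(c->base);              /* a real implementation would unmap here */
    c->base = NULL;
    sketch_hugepage_live--;
}

static int sketch_hugepage_alloc(sketch_chunk_t *c, size_t size)
{
    c->base = malloc(size);     /* stands in for a huge page mapping */
    if (NULL == c->base) {
        return -1;
    }
    c->size = size;
    c->release = sketch_hugepage_release;
    sketch_hugepage_live++;
    return 0;
}

/* Stub registration; the caller decides whether it fails. */
static int sketch_register(const sketch_chunk_t *c, int should_fail)
{
    (void)c;
    return should_fail ? -1 : 0;
}

int sketch_alloc_registered(sketch_chunk_t *c, size_t size,
                            int fail_registration)
{
    if (0 != sketch_hugepage_alloc(c, size)) {
        return -1;
    }
    if (0 != sketch_register(c, fail_registration)) {
        /* Error path: undo the allocation with its matching release
         * function rather than a generic free. */
        c->release(c);
        return -1;
    }
    return 0;
}
```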
cmr=v1.7.4:reviewer=rhc
This commit was SVN r30289.
Also add a verbose flag so one can see what devices are selected as well as another flag to override
locality information and use all devices on the node.
This commit was SVN r30287.