Update the comment explaining why we don't build the verbs libfabric
provider (because Open MPI doesn't use it). Specifically: the
upstream libfabric bug about building libfabric under travis has been
fixed -- so that's no longer the reason we don't build it.
NOTE: Building with external pmix *requires* that you also build with external libevent and hwloc libraries. Detect this at configure and error out with large message if this requirement is violated.
Closes#1204 (replaces it)
Fixes#1064
One of the criticisms of Travis on the call earlier today was that
when there's a compile failure, you really only see the *first*
compile failure. Then you fix that, resubmit, and 15-60 minutes
later, you see the *next* compile failure.
Per some discussion with @bosilca and @rhc54 this afternoon, use "make
-k", which will have "make" try to continue the build even after
compile failures. Hence, you'll be able to see many more compile
errors that just one at a time.
In some cases (e.g., if there's a typo in a top-level header file), it
will cause a bazillion lines of compile error output, but hopefully
that should be easy to spot / ignore.
A previous commit updated the one-sided code to register the state
region only once. This created an issue when using the scratch lock
with fetching atomics. In this case on any rank that isn't local rank
0 the module->state_handle is NULL. This commit fixes the issue by
removing the scratch lock and using a fragment pointer instead.
Fixesopen-mpi/ompi#1290
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
It was decided some time ago that there is no benefit to using any
per-peer receive queues on infiniband. At the time we decided not to
change the default but that objection has been dropped. This commit
changes the 128 message queue to use SRQ instead of PP. This has no
impact on iWarp which sets the default in a different way.
Closesopen-mpi/ompi#1156
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
accelerate message queue lookups if the lookups have
the proper tag&mask layout. OpenMPI should follow
PSM2's preferred tag&mask spec, so that PSM2 can provide
a performance benefit.
When PtlMDBind was removed, the result check was left in which
causes intermittent failures depending on the junk value found in
the 'ret' variable. The commit removes the result check.
Due to user confusion, update the show-help messages displayed when
processor and/or memory binding fails. Thanks to Dave Love
(@loveshack) for the initial suggestion.
Fixesopen-mpi/ompi#1087
This script is useful to measure times from launching ompi
application to different internal points. A user can easy add
it`s test basing on existing tests.
See readme information inside the script.
Rename the pmix1xx component to pmix111 so it reflects the actual release it includes
Resolve the problem of PMIx being passed a bogus --with-platform argument when configuring the PMIx tarball code. There is no reason we should be passing --with-platform arguments to any internal subdirectory, so just leave that out when constructing the opal_subdir_args variable.
Update the PMIx code and continue attempting to debug direct modex
Fix a problem in the ORTE PMIx server - there was an early intent to optimize the direct modex by fetching data for all procs from the target job on the remote node, instead of fetching the data one proc at a time. However, this was never completely implemented, and so we would hang if we had multiple overlapping requests for data from more than one proc on the node.
Update PMIx to v1.1.2