Issue noted by Walter Spector on the users mailing list.
Throwing to Craig Rasmussen for review.
cmr=v1.8.2:reviewer=jsquyres
This commit was SVN r31933.
This would be a really, really weird case if it ever happens (i.e.,
you have usnics but the agent process failed somewhere in MPI_INIT
such that the agent never appears), but having an infinite loop
doesn't seem like a good idea.
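A minimal sketch of replacing an unbounded wait with a bounded one. The names wait_for_agent, agent_ready_fn, and ready_after are hypothetical and are not taken from the usnic code; the sketch only shows the general shape of capping the number of polls.

```c
#include <stdbool.h>

/* Hypothetical predicate type: returns true once the agent is reachable. */
typedef bool (*agent_ready_fn)(void *ctx);

/* Poll `ready` at most `max_attempts` times instead of looping forever.
 * Returns true if the agent appeared, false if we gave up. */
bool wait_for_agent(agent_ready_fn ready, void *ctx, int max_attempts)
{
    for (int i = 0; i < max_attempts; ++i) {
        if (ready(ctx)) {
            return true;
        }
    }
    return false;
}

/* Example predicate (illustration only): reports "not ready" until the
 * counter behind ctx has been decremented to zero. */
bool ready_after(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        --*remaining;
        return false;
    }
    return true;
}
```

A caller would treat a false return as an error and abort initialization rather than spin.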
(does not need to go to v1.8 because v1.8 still uses RML for
communication for the connectivity checker)
This commit was SVN r31932.
This conservative fix tries to fetch info from both
opal_dstore_nonpeer and opal_dstore_peer.
This is required if task A spawns tasks B and C.
B was previously unable to find info from C, so locality
info was not set and coll/ml init hung.
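The "try both stores" idea can be sketched as a lookup with a fallback. The types kv_t and store_t and the functions store_fetch and fetch_either are hypothetical stand-ins, not the opal_dstore API, and the lookup order shown is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical key/value store, standing in for a dstore handle. */
typedef struct { const char *key; const char *value; } kv_t;
typedef struct { const kv_t *entries; size_t count; } store_t;

/* Return the value for `key` in store `s`, or NULL if it is absent. */
const char *store_fetch(const store_t *s, const char *key)
{
    for (size_t i = 0; i < s->count; ++i) {
        if (strcmp(s->entries[i].key, key) == 0) {
            return s->entries[i].value;
        }
    }
    return NULL;
}

/* Try the first store, then fall back to the second one. */
const char *fetch_either(const store_t *first, const store_t *second,
                         const char *key)
{
    const char *value = store_fetch(first, key);
    return value != NULL ? value : store_fetch(second, key);
}
```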
no CMR is required since v1.8 uses a unique dstore
This commit was SVN r31923.
if eager rdma is used, the endpoint reference_count is greater than one.
this commit is a temporary fix that calls OBJ_RELEASE on the endpoint as many times as needed.
though this is likely correct, it can be suboptimal and hence needs to be reviewed
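One possible reading of "as many times as needed", sketched with a hypothetical reference-counted type. endpoint_t, endpoint_new, endpoint_release, and release_fully are illustrative names, not the OPAL object system or the actual fix.

```c
#include <stdlib.h>

/* Hypothetical reference-counted endpoint. `destroyed_flag` exists only
 * so that a caller can observe destruction in this sketch. */
typedef struct {
    int reference_count;
    int *destroyed_flag;
} endpoint_t;

endpoint_t *endpoint_new(int initial_refs, int *destroyed_flag)
{
    endpoint_t *ep = malloc(sizeof(*ep));
    if (ep != NULL) {
        ep->reference_count = initial_refs;
        ep->destroyed_flag = destroyed_flag;
    }
    return ep;
}

/* Drop one reference; free the endpoint when the count reaches zero. */
void endpoint_release(endpoint_t *ep)
{
    if (--ep->reference_count == 0) {
        int *flag = ep->destroyed_flag;
        free(ep);
        if (flag != NULL) {
            *flag = 1;
        }
    }
}

/* Release every outstanding reference. The count is read once up front
 * because `ep` must not be touched after the final release frees it. */
int release_fully(endpoint_t *ep)
{
    int refs = ep->reference_count;
    for (int i = 0; i < refs; ++i) {
        endpoint_release(ep);
    }
    return refs;
}
```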
cmr=v1.8.2:reviewer=hjelmn
This commit was SVN r31922.
http://www.open-mpi.org/community/lists/devel/2014/05/14822.php
Revamp the ORTE global data structures to reduce memory footprint and add new features. Add ability to control/set cpu frequency, though this can only be done if the sys admin has setup the system to support it (or you run as root).
This commit was SVN r31916.
during the unpack and during the positioning.
Fixes trac:4610.
This commit was SVN r31904.
The following Trac tickets were found above:
Ticket 4610 --> https://svn.open-mpi.org/trac/ompi/ticket/4610
We were still leaking 1) file descriptors for data files, and 2) some
control files. I fixed both of these leaks and everything is looking
good. This should fix the bug where we are running out of file
descriptors when running the loop_spawn test. I also took the
opportunity to refactor the code a bit to make the mapping/unmapping
simpler. This should help avoid these sorts of issues in the future.
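A common way to avoid this kind of descriptor leak is to close the file descriptor as soon as the mapping exists, since POSIX keeps the mapping valid after close(). The sketch below assumes that pattern; map_file is a hypothetical helper and is not the refactored code from this commit.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map `len` bytes of the file at `path` as shared read/write memory.
 * The descriptor is closed on every path, so none can leak.
 * Returns NULL on failure. */
void *map_file(const char *path, size_t len)
{
    int fd = open(path, O_RDWR);
    if (fd < 0) {
        return NULL;
    }
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    return addr == MAP_FAILED ? NULL : addr;
}
```

The caller remains responsible for calling munmap() on the returned region.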
Depends on #4678
cmr=v1.8.2:reviewer=manjugv
This commit was SVN r31893.
if in_ptr is NULL, the MAP_FIXED flag cannot be passed to mmap.
passing it caused a hang in topology/cart and topology/sub from the ibm
test suite on trunk.
cmr=v1.8.2:reviewer=hjelmn
This commit was SVN r31890.
Thanks George for pointing this out.
cmr=v1.8.2:reviewer=bosilca:ticket=4676
This commit was SVN r31889.
The following Trac tickets were found above:
Ticket 4676 --> https://svn.open-mpi.org/trac/ompi/ticket/4676
This fixes a bug introduced in:
- r31815 (trunk)
- r31853 (v1.8 branch)
cmr=v1.8.2:reviewer=bosilca
This commit was SVN r31888.
The following SVN revision numbers were found above:
r31815 --> open-mpi/ompi@8bafe06c57
r31853 --> open-mpi/ompi@bff944d766
Per Ralph:
"I noticed that we are incrementing and decrementing the opal_progress_event state.
However, this no longer has any impact whatsoever on the RML as that is running in
the independent ORTE event thread. So all this actually does is impact the MPI layer
by adding an unnecessary overhead."
Thanks Ralph for pointing this out :-)
cmr=v1.8.2:reviewer=rhc:ticket=4671
This commit was SVN r31887.
The following Trac tickets were found above:
Ticket 4671 --> https://svn.open-mpi.org/trac/ompi/ticket/4671
since r31716 mca_topo_base_comm_cart_2_2_0_t is an object
and must be allocated/freed with OBJ_NEW/OBJ_RELEASE.
this fixes topology/cart_sub_zero from the ibm test suite.
v1.8 does not use objects, so no cmr for this branch
This commit was SVN r31883.
The following SVN revision numbers were found above:
r31716 --> open-mpi/ompi@e3df77548d
This commit is a slightly better workaround to prevent messages of
the form:
[unset]:_pmi_alps_get_apid:alps_app_lli_put_request failed
[unset]:_pmi_alps_get_appLayout:pmi_alps_get_apid returned with error: Bad file descriptor
It works by completely disabling PMI in the application process when using
mpirun. This should not be an issue for any apps.
cmr=v1.8.2:reviewer=rhc
This commit was SVN r31882.