Fix a few bugs in the mappers:
1. Ensure that bynode with no -np fills all available slots - it just does so with the ranks set bynode instead of byslot
2. Fix --nolocal behavior so it works correctly in all cases. We still have to test the host's name using opal_ifislocal in the mapper because the name that gethostname stores in orte_process_info.hostname can be an FQDN, while a hostfile may contain a non-FQDN version.
3. Add missing --nolocal logic to the seq mapper
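To illustrate why a plain string comparison of host names is not enough for --nolocal, here is a minimal sketch. It is not what opal_ifislocal does (that function is not reproduced here); same_short_hostname is a hypothetical helper that only compares the part of each name before the first dot, and it would not handle IP addresses or case differences.

```c
#include <stdbool.h>
#include <string.h>

/* Length of the host part before the first '.', or of the whole string. */
static size_t short_len(const char *name)
{
    const char *dot = strchr(name, '.');
    return dot ? (size_t)(dot - name) : strlen(name);
}

/* Hypothetical helper: true if both names share the same short host name.
 * Illustration only; a real check should resolve the name instead. */
static bool same_short_hostname(const char *a, const char *b)
{
    size_t la = short_len(a);
    size_t lb = short_len(b);
    return la == lb && strncmp(a, b, la) == 0;
}
```

With this helper, an FQDN such as "node01.example.com" matches the hostfile entry "node01", whereas strcmp on the two strings reports a mismatch.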
Oversubscribed mapping seemed to be working okay without repair, so I couldn't verify my own bug report in that regard.
Also included are some preliminary changes to support the modified hostfile behavior, which will be committed shortly:
1. Removed the totally useless "allocate" field in the orte_node_t object since every node is automatically allocated for use - and everything ignored the field anyway
2. Correctly initialize the slots_alloc field when the allocation is read
This commit was SVN r19030.
* Use synonym/deprecated MCA param API for some mca base params
* In openib BTL, if we have appropriate memory hooks support, and if
mpi_leave_pinned and mpi_leave_pinned_pipeline were not set by the
user, set mpi_leave_pinned to 1.
* Defer checking mpi_leave_pinned_* until as late as possible (i.e.,
until after the btl's have had a chance to set mpi_leave_pinned to
1):
* in ob1 pml
* in rdma mpool
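The ordering described above can be sketched as follows. This is a simplified model, not the actual openib BTL or ob1 code: pin_params_t, PIN_UNSET, and both functions are hypothetical names, and using -1 to mean "not set by the user" is an assumption of this sketch.

```c
#include <stdbool.h>

#define PIN_UNSET (-1)  /* assumed marker for "user did not set this param" */

typedef struct {
    int leave_pinned;           /* models mpi_leave_pinned */
    int leave_pinned_pipeline;  /* models mpi_leave_pinned_pipeline */
} pin_params_t;

/* Step 1 (BTL init): turn leave_pinned on only if memory hooks are
 * available and the user expressed no preference for either param. */
static void btl_maybe_enable_leave_pinned(pin_params_t *p, bool have_memory_hooks)
{
    if (have_memory_hooks &&
        PIN_UNSET == p->leave_pinned &&
        PIN_UNSET == p->leave_pinned_pipeline) {
        p->leave_pinned = 1;
    }
}

/* Step 2 (consumers such as the PML or mpool): read the value late,
 * after every BTL has had the chance to run step 1. */
static bool use_leave_pinned(const pin_params_t *p)
{
    return p->leave_pinned > 0;
}
```

If a consumer called use_leave_pinned before step 1 ran, it would see the unset value and miss the BTL's default, which is why the check is deferred.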
This commit was SVN r19022.
The following Trac tickets were found above:
Ticket 1379 --> https://svn.open-mpi.org/trac/ompi/ticket/1379
Kinda found by MTT. MTT calls 'ompi_info --all --parsable' and it was livelocked and had to be killed by hand.
I'm going to push this one to Jeff to push to v1.3 since he did the original implementation and should check this code.
This commit was SVN r19014.
a little bit more than "BTL was able to add some procs". The real condition to
allow the BTL progress is that we will use it to send/recv data to/from some
of the peers (this includes the BTL exclusivity in the process).
This commit was SVN r19010.
Modify the odls to remove a (size_t) typecast in front of the num_processors variable just in case it is returned negative. This usually is accompanied by an opal_error, so this shouldn't make any difference - but it is more technically correct.
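A small sketch of why the cast matters. The two functions below are hypothetical and are not the odls code; they only show that converting a negative int to size_t yields a very large positive value, so a "greater than zero" test passes when it should not.

```c
#include <stdbool.h>
#include <stddef.h>

/* With the cast: a negative count wraps to a huge unsigned value. */
static bool has_processors_with_cast(int num_processors)
{
    return (size_t)num_processors > 0;
}

/* Without the cast: a negative count is correctly rejected. */
static bool has_processors_signed(int num_processors)
{
    return num_processors > 0;
}
```

For a value of -1, the first function returns true and the second returns false; for valid positive counts they agree.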
This commit was SVN r19008.
Fixed allocation of all ranks when a RANKFILE is used but does not assign all ranks
Abort a little earlier if a RANKFILE is used but np wasn't specified
Clean mca_rmaps_rank_file_component.debug
This commit was SVN r19004.
* Fix linux paffinity component to make a "best" guess when PLPA
can't find topology information in the Linux kernel. That is, if
PLPA can't tell us the max_processor_id, just assume that it's the
same as the number of processors. If you have a more complex
system than that (e.g., you have holes in your available processor
IDs), you'll likely be running a Linux kernel that supports the
topology information, and this problem won't happen.
* Make sure to convert the return codes from PLPA to OPAL_ERR* codes.
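The two changes above can be sketched together. This is an illustration under stated assumptions, not the paffinity component itself: the SKETCH_* values stand in for OPAL codes (their numeric values are made up here), the topology query is modeled as an errno-style result passed in by the caller, and treating ENOSYS as "no topology information" is an assumption of this sketch.

```c
#include <errno.h>

/* Stand-ins for OPAL_* codes; the numeric values are illustrative only. */
enum {
    SKETCH_SUCCESS = 0,
    SKETCH_ERROR = -1,
    SKETCH_ERR_NOT_SUPPORTED = -2,
    SKETCH_ERR_BAD_PARAM = -3
};

/* Map an errno-style return code onto the project's own error codes. */
static int convert_rc(int rc)
{
    switch (rc) {
    case 0:      return SKETCH_SUCCESS;
    case ENOSYS: return SKETCH_ERR_NOT_SUPPORTED;
    case EINVAL: return SKETCH_ERR_BAD_PARAM;
    default:     return SKETCH_ERROR;
    }
}

/* topo_rc is the result of the topology query; topo_max_id is only
 * meaningful when topo_rc == 0. */
static int guess_max_processor_id(int topo_rc, int topo_max_id,
                                  int num_processors, int *max_id)
{
    if (0 == topo_rc) {
        *max_id = topo_max_id;
        return SKETCH_SUCCESS;
    }
    if (ENOSYS == topo_rc) {
        /* Best guess from the commit text: same as the processor count. */
        *max_id = num_processors;
        return SKETCH_SUCCESS;
    }
    return convert_rc(topo_rc);
}
```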
This commit was SVN r19001.
The following Trac tickets were found above:
Ticket 1250 --> https://svn.open-mpi.org/trac/ompi/ticket/1250
For now, hide the OSX component with .ompi_ignore so only I can see it until I can ensure that it doesn't inadvertently interfere with Linux and Solaris support.
This clears the conflict with Windows.
This commit was SVN r18989.
"!OpenFabrics" / neutral (i.e., refer to IB and/or iWARP).
* Mostly just type, variable/field, and function name changes, such as
s/hca/device/g, etc.
* Changed the INI file for the hardware-specific parameters to be
mca-btl-openib-device-params.ini.
* Updated a lot of help messages in the help-*.txt files, not just to
update them to !OpenFabrics/neutral language, but also for some
consistency of tone, indenting, etc.
* Deprecated a bunch of MCA params in favor of language-neutral new
ones:
* btl_openib_warn_no_hca_params_found (s/hca/device/)
* btl_openib_hca_param_files
* btl_openib_ib_cq_size (s/_ib_/_of_/)
* btl_openib_ib_max_inline_data
* btl_openib_ib_psn
* btl_openib_ib_mtu
* btl_openib_ib_pkey_ix
* btl_openib_ib_pkey_val
This commit was SVN r18985.
The following Trac tickets were found above:
Ticket 1295 --> https://svn.open-mpi.org/trac/ompi/ticket/1295
The rdmacm event handler has no way of reporting fatal errors to the upper
layers. By calling mca_btl_openib_endpoint_invoke_error in the rdmacm event
handler for the errors encountered, these errors can now be handled
appropriately.
Closes out Ticket #1283
This commit was SVN r18980.
--debug flag to help developers figure out possible future issues.
This fixes trac:1335.
This commit was SVN r18979.
The following Trac tickets were found above:
Ticket 1335 --> https://svn.open-mpi.org/trac/ompi/ticket/1335
1. add a new API delete_route(orte_process_name_t*) to delete the specified proc from the routing table
2. modify update_route so that it actually updates pre-existing routes instead of only adding routing info to the end of the hash table
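The intended semantics of the two calls can be sketched with a toy table. This is not the routed framework code: it uses a small fixed array instead of a hash table, plain int identifiers instead of orte_process_name_t, and all type and function signatures below are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_ROUTES 16

typedef struct { int target; int next_hop; bool in_use; } route_t;
typedef struct { route_t entries[MAX_ROUTES]; } route_table_t;

static route_t *find_route(route_table_t *t, int target)
{
    for (size_t i = 0; i < MAX_ROUTES; i++) {
        if (t->entries[i].in_use && t->entries[i].target == target) {
            return &t->entries[i];
        }
    }
    return NULL;
}

/* Overwrite an existing route if there is one; otherwise add a new entry. */
static int update_route(route_table_t *t, int target, int next_hop)
{
    route_t *r = find_route(t, target);
    if (NULL != r) {
        r->next_hop = next_hop;
        return 0;
    }
    for (size_t i = 0; i < MAX_ROUTES; i++) {
        if (!t->entries[i].in_use) {
            t->entries[i] = (route_t){ target, next_hop, true };
            return 0;
        }
    }
    return -1; /* table full */
}

/* Remove the route for target; returns -1 if no such route exists. */
static int delete_route(route_table_t *t, int target)
{
    route_t *r = find_route(t, target);
    if (NULL == r) {
        return -1;
    }
    r->in_use = false;
    return 0;
}

/* Number of live entries, used to show that updates do not duplicate. */
static size_t route_count(const route_table_t *t)
{
    size_t n = 0;
    for (size_t i = 0; i < MAX_ROUTES; i++) {
        n += t->entries[i].in_use ? 1 : 0;
    }
    return n;
}
```

Calling update_route twice for the same target leaves one entry holding the latest next hop, rather than two entries of which only the first is ever found.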
This fixes ticket #1403
This commit was SVN r18970.
can have a pub_endpoint and a sub_endpoint that are not equal but go
to the same place (fd). I didn't think that that was possible. :-\
So just use a bool to track whether we have forwarded the fragment at
all; if we have, then don't forward to the sub_endpoint.
IOF is going to be re-written for v1.4.
This commit was SVN r18950.
The following SVN revision numbers were found above:
r18873 --> open-mpi/ompi@773c92a6eb
environment, file, or API override).
Refs trac:1397
This commit was SVN r18943.
The following Trac tickets were found above:
Ticket 1397 --> https://svn.open-mpi.org/trac/ompi/ticket/1397