When using an external libfabric (or really any libfabric newer than
libfabric commit 607e863), we must use fi_getname to determine the local
port of our endpoint. Without this fix, OMPI will hang endlessly
while retransmitting packets to port 0 on the remote host.
Code for setting proc node locality
was absent after the removal of Cray
PMI KVS usage. This commit puts that
functionality back in place.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
When activating short receive blocks on the overflow list, remove
the PTL_ME_EVENT_LINK_DISABLE flag so the event gets generated.
Without PTL_EVENT_LINK, the block status can't reach the activated
state.
Replace #ifdef with #if for Open MPI configure booleans, because
Open MPI configure booleans are always defined and the value must
be checked.
Turns out that this is just copy-n-pasted code from OMPI. To be
clear: there's no need for the oshmem layer to instantiate sentinels
like mpi_fortran_bottom.
Thanks @jsquyres for pointing this.
It is perfectly ok to be on a system without UD devices.
Also, make some of the error messages better -- so that the user has a
clue about where the error messages are coming from, and what they
should do.
Only install the fake usnic libibverbs driver when there are actually
usnic kernel devices present. This prevents some run-time weirdness
on the Cray verbs emulation environment, where apparently
ibv_register_driver() either is not implemented or does not work
properly.
In days past, some implementations of Portals4 could not cover all
of memory with a single Memory Descriptor so multiple large
overlapping Memory Descriptors were created. Because none of the
current implementations have this limitation (and no future
implementations should either), this commit removes the overlapping
Memory Descriptors code.
If OMPI is initialized as thread multiple, then it is possible for
Portals events to be processed out of order by different threads.
Out of order events could lead to reactivation of the block
(PTL_EVENT_AUTO_FREE) before the block is removed from the active
list (PTL_EVENT_AUTO_UNLINK). This commit adds a status field to
ompi_mtl_portals4_recv_short_block_t that coordinates these events.
There was a redundant computation of the vpid
for orted's happening in ess/alps rte_init
method. Keep the more efficient alps based
method.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
The renamed variables used the same identifiers as variables in
errcode.c. To avoid confusion rename the variables to end in _intern.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>