This commit removes the use of ompi_group_peer_lookup in the
ompi_dpm_mark_dyncomm function. The function now uses
ompi_group_get_proc_name which does not allocate an ompi_proc_t if one
does not already exist.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Add an accessor for the proc_endpoints[OMPI_PROC_ENDPOINT_TAG_MTL]
member of the ompi_proc_t structure. This accessor calls add_procs
with the ompi_proc_t if the member is NULL.
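
The accessor pattern described above can be sketched roughly as follows.
This is a minimal illustration, not the actual Open MPI code: every
demo_* name, the simplified structure, and the single-argument add_procs
stand-in are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real ompi_proc_t or MTL interface. */
enum { DEMO_ENDPOINT_TAG_MTL = 0, DEMO_ENDPOINT_TAG_MAX = 1 };

typedef struct demo_proc {
    void *proc_endpoints[DEMO_ENDPOINT_TAG_MAX];
} demo_proc_t;

int demo_add_procs_calls = 0;   /* counts how often the stand-in add_procs ran */
int demo_endpoint_storage;      /* dummy object that plays the endpoint */

/* Stand-in for the MTL's add_procs entry point: fills in the endpoint. */
int demo_add_procs(demo_proc_t *proc)
{
    ++demo_add_procs_calls;
    proc->proc_endpoints[DEMO_ENDPOINT_TAG_MTL] = &demo_endpoint_storage;
    return 0;
}

/* Accessor: create the endpoint on first use, then return the cached one. */
void *demo_get_mtl_endpoint(demo_proc_t *proc)
{
    if (NULL == proc->proc_endpoints[DEMO_ENDPOINT_TAG_MTL]) {
        if (0 != demo_add_procs(proc)) {
            return NULL;   /* endpoint could not be created */
        }
    }
    return proc->proc_endpoints[DEMO_ENDPOINT_TAG_MTL];
}
```

Callers that always go through the accessor never see a NULL endpoint
unless creation itself fails, and add_procs runs at most once per proc.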
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
Add an accessor for the proc_endpoints[OMPI_PROC_ENDPOINT_TAG_MTL]
member of the ompi_proc_t structure. This accessor calls add_procs
with the ompi_proc_t if the member is NULL. Tested on an InfiniPath
system with InfiniPath_QLE7340 HCAs.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
Updated the union/difference code to remove an extra n^2 translation
of ranks. This comes at the cost of extra memory but greatly
simplifies the code.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
This commit modifies the ompi_group_t union/difference code to compare/copy the
raw group values. Each value is either an ompi_proc_t pointer or a sentinel value. This
commit also adds helper functions to convert between opal process names and
sentinel values.
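
As a rough illustration of comparing and copying raw group values, the
sketch below builds a union of two groups whose entries are opaque
pointer-sized values (either a real pointer or a sentinel). All demo_*
names are hypothetical, and the linear membership scan is an assumption
of the sketch; the commit message does not say how the real code
searches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if raw value v appears in group g (plain linear scan). */
bool demo_contains(const intptr_t *g, size_t n, intptr_t v)
{
    for (size_t i = 0; i < n; ++i) {
        if (g[i] == v) {
            return true;
        }
    }
    return false;
}

/* Union in MPI_Group_union order: all of g1, then the entries of g2 that
 * are not in g1. Raw values are compared directly, so no rank translation
 * is needed. out must have room for n1 + n2 entries; returns the number
 * of entries written. */
size_t demo_group_union(const intptr_t *g1, size_t n1,
                        const intptr_t *g2, size_t n2, intptr_t *out)
{
    size_t n = 0;
    for (size_t i = 0; i < n1; ++i) {
        out[n++] = g1[i];
    }
    for (size_t i = 0; i < n2; ++i) {
        if (!demo_contains(g1, n1, g2[i])) {
            out[n++] = g2[i];
        }
    }
    return n;
}
```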
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
This commit modifies ompi's process list group object to support a
sentinel value for nonexistent ompi_proc_t objects. The sentinel was
chosen to be the negative of the opal_process_name_t of the associated
ompi_proc_t. This takes advantage of the fact that on most (all?)
systems the top bit of a user-space pointer is never set. If this
changes then a new sentinel will be needed.
In addition, this commit modifies the way ompi_mpi_comm_world is
initialized to fill in the group with sentinel values if the number of
processes exceeds the new add_procs behavior cutoff.
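
The sentinel encoding described above can be sketched as follows. This
is only an illustration under stated assumptions: the process name is
modeled as a hypothetical pair of 32-bit fields packed into 64 bits, the
packed value is assumed to be strictly positive, and pointers are
assumed to be 64 bits wide. None of the demo_* names come from the
source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: pointer-sized slots can hold a 64-bit packed name. */
_Static_assert(sizeof(intptr_t) >= sizeof(int64_t), "sketch needs 64-bit slots");

/* Hypothetical stand-in for opal_process_name_t. */
typedef struct demo_name {
    uint32_t jobid;
    uint32_t vpid;
} demo_name_t;

/* Pack the name into one signed 64-bit value. */
int64_t demo_pack(demo_name_t n)
{
    return (int64_t) (((uint64_t) n.jobid << 32) | n.vpid);
}

/* A slot is a sentinel if its top bit is set, i.e. it is negative. */
bool demo_is_sentinel(intptr_t slot)
{
    return slot < 0;
}

/* The sentinel is the negative of the packed name. */
intptr_t demo_name_to_sentinel(demo_name_t n)
{
    int64_t packed = demo_pack(n);
    assert(packed > 0);   /* assumption: top bit clear and name nonzero */
    return (intptr_t) (-packed);
}

/* Negate again to recover the packed name, then unpack it. */
demo_name_t demo_sentinel_to_name(intptr_t sentinel)
{
    uint64_t packed = (uint64_t) (-(int64_t) sentinel);
    demo_name_t n = { (uint32_t) (packed >> 32), (uint32_t) (packed & 0xffffffffu) };
    return n;
}
```

A group slot can then be tested with demo_is_sentinel() before it is
dereferenced; only non-sentinel slots are treated as real pointers.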
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds an opal hash table to keep track of the mapping between
process identifiers and ompi_proc_t objects. This hash table is used by the
ompi_proc_by_name() function to look up (in O(1) time) a given
process. This can be used by a BTL or other component to get an
ompi_proc_t when handling an incoming message from an as yet unknown
peer.
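
A name-to-proc lookup of this kind might look roughly like the sketch
below. It uses a small fixed-size open-addressing table instead of the
opal hash table, models the process identifier as a plain 64-bit
integer, and creates a missing entry on lookup; all three choices, and
every demo_* name, are assumptions of the sketch rather than facts
about the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_TABLE_BITS 6
#define DEMO_TABLE_SIZE (1u << DEMO_TABLE_BITS)   /* the sketch never resizes */

/* Hypothetical stand-in for ompi_proc_t, keyed by a 64-bit identifier. */
typedef struct demo_proc {
    uint64_t name;
} demo_proc_t;

demo_proc_t *demo_table[DEMO_TABLE_SIZE];

/* Multiplicative hash that keeps the top DEMO_TABLE_BITS bits. */
size_t demo_hash(uint64_t name)
{
    return (size_t) ((name * 0x9E3779B97F4A7C15ull) >> (64 - DEMO_TABLE_BITS));
}

/* Return the slot holding name, or the first empty slot on its probe
 * sequence. Returns NULL only if the table is full. */
demo_proc_t **demo_find_slot(uint64_t name)
{
    size_t start = demo_hash(name);
    for (size_t i = 0; i < DEMO_TABLE_SIZE; ++i) {
        demo_proc_t **slot = &demo_table[(start + i) & (DEMO_TABLE_SIZE - 1)];
        if (NULL == *slot || (*slot)->name == name) {
            return slot;
        }
    }
    return NULL;
}

/* Look up a proc by name in expected O(1) time, creating it if missing. */
demo_proc_t *demo_proc_for_name(uint64_t name)
{
    demo_proc_t **slot = demo_find_slot(name);
    if (NULL == slot) {
        return NULL;
    }
    if (NULL == *slot) {
        demo_proc_t *proc = malloc(sizeof(*proc));
        if (NULL == proc) {
            return NULL;
        }
        proc->name = name;
        *slot = proc;
    }
    return *slot;
}
```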
Additionally, this commit adds a new MCA variable to control the new
add_procs behavior: mpi_add_procs_cutoff. If the number of ranks in
the job falls below the threshold, an ompi_proc_t is created for
every process. If the number of ranks is above the threshold, then an
ompi_proc_t is only created for the local rank. The code needed to
generate additional ompi_proc_t objects for a communicator is not yet
complete.
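
The cutoff rule can be written down as a small predicate. This is a
sketch only: the function name and parameters are hypothetical, and
treating a rank count exactly equal to the cutoff like the "above"
case is an assumption, since the text does not spell out the boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Decide whether a proc object for `rank` is created at init time.
 * Below the cutoff every rank gets one; otherwise only the local rank.
 * Assumption: nprocs == cutoff is handled like the "above" case. */
bool demo_create_proc_at_init(size_t rank, size_t my_rank,
                              size_t nprocs, size_t cutoff)
{
    if (nprocs < cutoff) {
        return true;
    }
    return rank == my_rank;
}
```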
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit contains the following changes:
- pml/ob1: use the bml accessor function when requesting a bml
endpoint. This ensures that bml endpoints are only created when
needed. For example, a bml endpoint is neither requested nor
allocated when receiving an eager message from a peer.
- pml/ob1: change the pml_procs array in the ob1 communicator to a
proc pointer array. At the cost of a single extra level of
indirection, this allows us to allocate pml procs on demand.
- pml/ob1: add an accessor function to access the pml proc structure
for a given peer. This function allocates the proc if it
doesn't already exist.
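
The pointer-array change in the list above can be sketched like this:
the communicator holds an array of pointers that start out NULL, and an
accessor allocates an entry the first time a peer is touched. The types
and demo_* names are hypothetical simplifications, not the real ob1
structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-peer state and communicator. */
typedef struct demo_pml_proc {
    int peer_rank;
    unsigned expected_sequence;
} demo_pml_proc_t;

typedef struct demo_pml_comm {
    size_t num_procs;
    demo_pml_proc_t **procs;   /* pointer array; entries start out NULL */
} demo_pml_comm_t;

/* Allocate only the pointer array, not the per-peer structures. */
int demo_comm_init(demo_pml_comm_t *comm, size_t num_procs)
{
    comm->procs = calloc(num_procs, sizeof(*comm->procs));
    if (NULL == comm->procs) {
        return -1;
    }
    comm->num_procs = num_procs;
    return 0;
}

/* Accessor: allocate the per-peer structure on first use. */
demo_pml_proc_t *demo_peer_lookup(demo_pml_comm_t *comm, size_t rank)
{
    if (rank >= comm->num_procs) {
        return NULL;
    }
    if (NULL == comm->procs[rank]) {
        demo_pml_proc_t *proc = calloc(1, sizeof(*proc));
        if (NULL == proc) {
            return NULL;
        }
        proc->peer_rank = (int) rank;
        comm->procs[rank] = proc;
    }
    return comm->procs[rank];
}

/* Release whatever was allocated on demand, then the array itself. */
void demo_comm_fini(demo_pml_comm_t *comm)
{
    for (size_t i = 0; i < comm->num_procs; ++i) {
        free(comm->procs[i]);
    }
    free(comm->procs);
    comm->procs = NULL;
    comm->num_procs = 0;
}
```

Memory for peers that are never contacted is limited to one pointer per
rank, which is the trade-off the extra level of indirection buys.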
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit contains the following changes:
- bml: add a function to add a single process. This function is
intended to remove the need to maintain an opal_bitmap_t, as it is
irrelevant for a single proc. BTLs will need to be updated to
either 1) ignore the return code from opal_bitmap_set_bit or 2) not
call the function if the reachability bitmap is NULL.
- bml: add an inline accessor function for getting the bml endpoint
for a peer proc. This function will either 1) return the cached bml
endpoint, or 2) create the endpoint and call add_proc with all
available BTL modules.
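
The second option for BTLs (skip the bitmap call when the reachability
bitmap is NULL) might look like the sketch below. The bitmap type and
demo_* functions are hypothetical stand-ins; they are not the
opal_bitmap_t API or a real BTL add_procs signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-word bitmap standing in for opal_bitmap_t. */
typedef struct demo_bitmap {
    unsigned long bits;
} demo_bitmap_t;

/* Set one bit; fails on a NULL bitmap or an out-of-range bit. */
int demo_bitmap_set_bit(demo_bitmap_t *bm, int bit)
{
    if (NULL == bm || bit < 0 || bit >= (int) (8 * sizeof(bm->bits))) {
        return -1;
    }
    bm->bits |= 1ul << bit;
    return 0;
}

/* BTL-side add_procs sketch. `reachable` may be NULL when the caller adds
 * a single proc and does not track reachability. */
int demo_btl_add_procs(size_t nprocs, demo_bitmap_t *reachable)
{
    for (size_t i = 0; i < nprocs; ++i) {
        /* ... endpoint setup for proc i would happen here ... */
        if (NULL != reachable) {
            int rc = demo_bitmap_set_bit(reachable, (int) i);
            if (0 != rc) {
                return rc;
            }
        }
    }
    return 0;
}
```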
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
As of v15.7, the PGI Fortran compiler does not properly support how
Open MPI uses the "USE ... ONLY" Fortran syntax to include modules
with conflicting symbol definitions (interestingly, pgfortran only has
a problem with this when compiling with -g).
In short, OMPI uses "USE :: module_aaa, ONLY: foo" and "USE ::
module_bbb, ONLY: bar" to use modules aaa and bbb, even though they
contain conflicting definitions for some symbols. However, the use of
the ONLY clause should preclude the inclusion of the conflicting
symbols -- as the word implies, it should direct the compiler to
*only* use the symbols identified by the clause (i.e., foo and bar, in
this example).
This commit adds a configure test for this capability. If the
compiler fails to build a simple test that mimics this behavior, then
disable the mpi_f08 bindings.
Fixes open-mpi/ompi#857
After lengthy debugging, I found last week why this optimization originally broke
some hdf5 tests. We now pass the hdf5 test suite with the optimization actively in use.
Specifically:
- reduce the number of reallocs and mallocs by moving
some arrays out of the cycle loop, if we know that their
size is not changing
- store the rank of the aggregator in a separate variable to avoid
continuous dereferencing
- change the wait_all logic in write_all to use a fixed number of requests
(even if they are MPI_REQUEST_NULL)
- fix the timing to consider the two initial allgather operations and the one
allgatherv operation to be a part of it
- add more comments.