Change ompi_mtl_ofi_get_endpoint() to call the active PML's add_procs()
rather than the OFI MTL add_procs() directly when discovering a new
process during operation.
Functionally, this has no impact on correct operation. However, the
current behavior means that the heterogeneous and active PML checks
are not executed in the dynamic discovery case.
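A minimal sketch of the new call path, assuming a simplified caller (MCA_PML_CALL is the real PML dispatch macro; the function and its surroundings are illustrative):

```c
#include "ompi/mca/pml/pml.h"   /* MCA_PML_CALL dispatch macro */
#include "ompi/proc/proc.h"     /* ompi_proc_t */

/* Illustrative discovery path, not the actual MTL code. */
static int discover_peer(ompi_proc_t *proc)
{
    /* Before: ompi_mtl_ofi_add_procs() was called directly, so the
     * heterogeneous and active-PML checks in the PML layer never ran. */

    /* After: route the newly discovered proc through the active PML. */
    return MCA_PML_CALL(add_procs(&proc, 1));
}
```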
Signed-off-by: Michael Heinz <michael.william.heinz@intel.com>
Also add a common verbose variable.
Note the verbosity handling is a little tricky owing to the way the MCA
frameworks and components are registered and initialized: the BTLs are
registered/initialized before the MTL components are even registered.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
- added detection of the new API to the configure logic
- added a tag_send call implemented using the new API
- added MPI_Send/MPI_Isend/MPI_Recv/MPI_Irecv implementations
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
Adds the capability to select a NIC based on hardware locality.
Creates a list of NICs that share the same cpuset as the process,
then selects a NIC based on (local rank) % (number of NICs).
If no NICs share the process's cpuset, the selection falls back to
a list of all available NICs and again chooses by
(local rank) % (number of NICs).
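A minimal sketch of that selection logic, under assumed names (near_nics, all_nics, and local_rank are illustrative, not the actual OFI component variables):

```c
#include <stddef.h>

struct nic;  /* opaque handle for an available NIC (illustrative) */

static struct nic *select_nic(struct nic **near_nics, size_t num_near,
                              struct nic **all_nics, size_t num_all,
                              int local_rank)
{
    if (num_near > 0) {
        /* Prefer NICs sharing the process's cpuset, round-robin
         * by local rank. */
        return near_nics[(size_t) local_rank % num_near];
    }
    /* Fall back to the full list of available NICs. */
    return all_nics[(size_t) local_rank % num_all];
}
```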
Signed-off-by: Nikola Dancejic <dancejic@amazon.com>
Consolidate the ompi_process_info and opal_process_info structs to
remove duplicate storage and conversion issues. Unwind some interweaving
of include files using opal.h. Silence a couple of warnings.
For now, set the arch to local if PMIX_ARCH is not found.
Signed-off-by: Ralph Castain <rhc@pmix.org>
For direct modex, all procs publish their selected PML module, and then
at add_procs time each proc's PML module is checked against every other
proc in the add_procs call.
For full modex, there is no change in functionality: only rank 0
publishes its selected PML, and all other procs in the add_procs call
check their selected PML against rank 0's.
If the PMLs do not match, raise an error and exit.
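A hedged sketch of the mismatch check (the real logic lives in mca_pml_base_pml_check_selected(); the helper and names below are illustrative):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Compare our selected PML name against the name a peer published
 * via PMIx (rank 0 in full modex, every peer in direct modex). */
static void check_pml_match(const char *my_pml, const char *peer_pml,
                            int peer_rank)
{
    if (0 != strcmp(my_pml, peer_pml)) {
        fprintf(stderr,
                "PML mismatch: local '%s' vs rank %d's '%s' -- aborting\n",
                my_pml, peer_rank, peer_pml);
        exit(EXIT_FAILURE);
    }
}
```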
Signed-off-by: Dipti Kothari <dkothar@amazon.com>
Pass the correct ompi_proc_t and array length to
mca_pml_base_pml_check_selected() during dynamic modex.
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
Add a framework to support different types of threading models, including
user-space thread packages such as Qthreads and Argobots:
https://github.com/pmodels/argobots
https://github.com/Qthreads/qthreads
The default threading model is pthreads. Alternate thread models are
specified at configure time using the --with-threads=X option.
The framework is static: the threading model to use is selected at
Open MPI configure/build time.
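For example (assuming an Argobots install is discoverable at configure time; the exact value spelling is an assumption based on the option described above):

```sh
# Hypothetical invocation selecting the Argobots threading model
./configure --with-threads=argobots
```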
mca/threads: implement Argobots threading layer
config: fix thread configury
- Add double quotes
- Change Argobot to Argobots
config: implement Argobots check
If the poll time is too long, MPI hangs.
This quick fix just sets it to 0, but it is not good for the
Pthreads version. Need to find a good way to abstract it.
Note that even 1 (= 1 millisecond) causes disastrous performance
degradation.
rework threads MCA framework configury
It now works more like the ompi/mca/rte configury,
modulo some edge items that are special for threading package
linking, etc.
qthreads module
some argobots cleanup
Signed-off-by: Noah Evans <noah.evans@gmail.com>
Signed-off-by: Shintaro Iwasaki <siwasaki@anl.gov>
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
Change ompi_mtl_ofi_get_endpoint() to call the active PML's
add_procs() rather than the OFI MTL add_procs() directly when
discovering a new process during operation.
Functionally, this has no impact on correct operation. However,
the current behavior means that the heterogeneous and active PML
checks are not executed in the dynamic discovery case.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
We currently save the hostname of a proc when we create the ompi_proc_t for it. This was originally done because the only method we had for discovering the host of a proc was to include that info in the modex, and we therefore had to store it somewhere proc-local. Obviously, this carried a memory penalty for storing all those strings, and so we added a "cutoff" parameter so that we wouldn't collect hostnames above a certain number of procs.
Unfortunately, this still results in an 8-byte/proc memory cost as we have a char* pointer in the opal_proc_t that is contained in the ompi_proc_t so that we can store the hostname of the other procs if we fall below the cutoff. At scale, this can consume a fair amount of memory.
With the switch to relying on PMIx, there is no longer a need to cache the proc hostnames. Using the "optional" feature of PMIx_Get, we restrict the retrieval to be purely proc-local - i.e., we retrieve the info either via shared memory or from within the proc-internal hash storage (depending upon the active PMIx components). Thus, the retrieval of a hostname is purely a local operation involving no communication.
All RMs are required to provide a complete hostname map of all procs at startup. Thus, we have full access to all hostnames without including them in a modex or having to cache them on each proc. This allows us to remove the char* pointer from the opal_proc_t, saving us 8 bytes/proc.
Unfortunately, PMIx_Get does not currently support the return of a static pointer to memory. Thus, even though PMIx has the hostname in its memory, it can only return a malloc'd version of it. I have therefore ensured that the return from opal_get_proc_hostname is consistently malloc'd and free'd wherever used. This shouldn't be a burden as the hostname is only used in one of two circumstances:
(a) in an error message
(b) in a verbose output for debugging purposes
Thus, there should be no performance penalty associated with the malloc/free requirement. PMIx will eventually be returning static pointers, and so we can eventually simplify this method and return a "const char*" - but as noted, this really isn't an issue even today.
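A hedged sketch of the retrieval pattern described above, written against the public PMIx client API (error handling trimmed; the actual OMPI wrapper is opal_get_proc_hostname()):

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <pmix.h>

/* Return a malloc'd copy of a peer's hostname, resolved locally only;
 * the caller must free() the result. */
static char *get_hostname_local(const pmix_proc_t *peer)
{
    pmix_info_t info;
    pmix_value_t *val = NULL;
    bool optional = true;
    char *hostname = NULL;

    /* PMIX_OPTIONAL restricts PMIx_Get to proc-local sources (shared
     * memory or the internal hash) -- no communication is performed. */
    PMIX_INFO_LOAD(&info, PMIX_OPTIONAL, &optional, PMIX_BOOL);

    if (PMIX_SUCCESS == PMIx_Get(peer, PMIX_HOSTNAME, &info, 1, &val) &&
        NULL != val && PMIX_STRING == val->type) {
        hostname = strdup(val->data.string);
        PMIX_VALUE_RELEASE(val);
    }
    PMIX_INFO_DESTRUCT(&info);
    return hostname;
}
```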
Signed-off-by: Ralph Castain <rhc@pmix.org>
This commit fixes an issue with the include usage in some
ompi source files. These source files are using the <> form
of include when the "" form is correct (as these are internal,
**not** system headers).
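For instance (a representative header, not necessarily one touched by this commit):

```c
/* Wrong: angle brackets ask the preprocessor for a system header */
#include <ompi/communicator/communicator.h>

/* Right: quotes mark a project-internal header */
#include "ompi/communicator/communicator.h"
```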
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
- Port memchecker call from a1d502c.
- Remove unused memcheck macro variables.
- Some code readability improvements.
- Remove some stray +1's in dynamic comm cleanup.
- Re-add OPAL_ENABLE_DEBUG macro to osc header.
- Clean up some printfs and includes.
- Refactor cleanup of dpm_disconnect_objs.
Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
Temporary solution for the PML inconsistency issue discussed in #7475.
This patch addresses two things: first, it makes the PMIx key optional
so that if we are not in full modex mode we don't do a direct modex;
second, it gets the PML info from vpid 0 instead of from the local
rank.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
1- Remove the common-symbols issue (global variables not initialized, #7424):
   move the variables to local scope within the set_info function.
2- Remove the GPFS data-shipping hints: not used anymore.
3- Redirect the output stream to the corresponding fs framework.
Signed-off-by: raafatfeki <fekiraafat@gmail.com>
The communicator might not exist yet when mca_fs_gpfs_component_file_query() is called.
Therefore, we need to check it before calling the broadcast function.
Signed-off-by: raafatfeki <fekiraafat@gmail.com>
This commit changes the behavior of the individual sharedfp component. If
the component cannot create either the data file or the metadata file during
File_open, no error is raised anymore. This allows applications that do not
use shared file pointer operations to continue execution without any issue.
If the user subsequently calls MPI_File_write_shared or a similar operation,
however, an error will be raised.
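A minimal sketch of the resulting behavior (hypothetical application code; error checking elided to keep the failure point visible):

```c
#include <mpi.h>

void demo(MPI_Comm comm)
{
    MPI_File fh;
    int buf = 42;

    /* Succeeds even if the sharedfp data/metadata files could not
     * be created during open. */
    MPI_File_open(comm, "out.dat", MPI_MODE_CREATE | MPI_MODE_WRONLY,
                  MPI_INFO_NULL, &fh);

    /* Independent I/O continues to work... */
    MPI_File_write_at(fh, 0, &buf, 1, MPI_INT, MPI_STATUS_IGNORE);

    /* ...but a shared-file-pointer operation now raises the error. */
    MPI_File_write_shared(fh, &buf, 1, MPI_INT, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
}
```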
Fixes issue #7429
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
This commit fixes an issue with non-debug builds where adding an
attachment to the attachment list doesn't actually happen. This
causes all MPI_Win_detach calls to fail. The call was made inside an
assert(), which is compiled out in optimized builds.
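The pattern, in a hedged sketch (add_attachment() is a hypothetical stand-in for the actual list-append call):

```c
#include <assert.h>

struct list;    /* opaque attachment list (illustrative) */
struct attach;  /* opaque attachment (illustrative) */

int add_attachment(struct list *l, struct attach *a);  /* hypothetical */

void detach_demo(struct list *l, struct attach *a)
{
    /* Buggy: under NDEBUG the whole expression, side effect included,
     * is compiled out, so the attachment is never added:
     *     assert(0 == add_attachment(l, a));
     */

    /* Fixed: the operation always runs; only the check disappears. */
    int ret = add_attachment(l, a);
    assert(0 == ret);
    (void) ret;  /* silence unused-variable warnings under NDEBUG */
}
```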
Signed-off-by: Nathan Hjelm <hjelmn@google.com>