Commit graph

10709 commits

Author SHA1 Message Date
Michael Heinz
f10305a49f Add check for PSM2 reference counting to PSM2 MTL #7721
As discussed, a feature is being added to libpsm2 to correctly handle
the case where the library is opened by multiple OMPI transports in the same
process. (For example, the OFI BTL and the PSM2 MTL).

* Improved error message to indicate required libpsm2 version.

* Adds a test at autogen/configure time for the existence of
  PSM2_LIB_REFCOUNT_CAP.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Signed-off-by: Michael Heinz <michael.william.heinz@intel.com>
2020-05-18 15:25:22 -04:00
Michael Heinz
dbbdb8f2e2
Merge pull request #7621 from jsquyres/pr/remove-osc-pt2pt
Remove OSC pt2pt component
2020-05-08 12:43:57 -04:00
Brian Barrett
0dc2325297
Merge pull request #7641 from dancejic/multi-NIC
Added multi-NIC support to provider selection
2020-05-07 15:24:41 -07:00
Jeff Squyres
9afe58643e
Merge pull request #7600 from jsquyres/pr/mpit-general-docs
MPI_T general docs
2020-05-07 10:11:40 -04:00
Ralph Castain
42b3541242
Update mtl_psm2.c
Track change in PMIx

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-05-06 17:50:45 -07:00
Michael Heinz
c55c9e67f4 PSM2 update to use PRRTE instead of ORTE
Signed-off-by: Michael Heinz <michael.william.heinz@intel.com>
2020-05-06 16:16:27 -04:00
Jeff Squyres
70993e1670 Move "MPI" and "OpenMPI" man pages to section 5
Make the main man page be Open-MPI(5), and set nroff-native aliases
for MPI(5) and OpenMPI(5).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-05-02 12:45:32 -07:00
Jeff Squyres
7ace873b50 Add MPI_T.5 man page for Open MPI-specific info
Also added infrastructure to have developers write man pages in
Markdown (vs. nroff).  Pandoc >=v1.12 is used to convert those
Markdown files into actual nroff man pages.

Dist tarballs will contain generated nroff man pages; we don't want to
require users to have Pandoc installed.  Anyone who builds Open MPI
from a git clone will need to have Pandoc installed (similar to how we
treat Flex).  You can opt out of Open MPI's Pandoc-generated man pages
by configuring Open MPI with --disable-man-pages.  This will also
disable "make dist" (i.e., "make dist" will error if you configured
with --disable-man-pages).

Also removed the stuff to re-generate man pages.

This commit also:

1. Includes a new man page, written in Markdown
   (ompi/mpi/man/man5/MPI_T.5.md) that contains Open MPI-specific
   information about MPI_T.
2. Includes a converted ompi/mpi/man/man3/MPI_T_init_thread.3.md (from
   MPI_T_init_thread.3in -- i.e., nroff) just to show that Markdown
   can be used throughout the Open MPI code base for man pages.
3. Made the Makefiles in ompi/mpi/man/man?/ be full-fledged
   Makefile.am's (vs. Makefile.extras that are designed to be included
   in ompi/Makefile.am).  It is more convenient to test generation /
   installation of man pages when you can "make" and "make install" in
   their respective directories (vs. doing a build / install for the
   entire ompi project).
4. Removed logic from ompi/Makefile.am that re-generated man pages if
   opal_config.h changes.

Other man pages -- hopefully all of them! -- will be converted to
Markdown over time.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-05-02 12:45:31 -07:00
Yossi Itigin
b61bf9a00a
Merge pull request #7349 from hoopoepg/topic/ucx-new-api-nbx
OPAL/UCX: enabling new API provided by UCX
2020-05-02 14:30:44 +03:00
Ralph Castain
f608575eec
Remove references to numa_rank
Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-05-01 13:32:29 -07:00
Ralph Castain
86709b1c80
Fix PMIx_Fence call signature
Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-05-01 12:27:42 -07:00
Sergey Oblomov
75bda25ddb OPAL/UCX: enabling new API provided by UCX
- added detection of new API into configuration
- added tag_send call implemented using new API
- added MPI_Send/MPI_Isend/MPI_Recv/MPI_Irecv implementations

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2020-05-01 17:58:29 +03:00
Nikola Dancejic
167d75b42a common/ofi: Added multi-NIC support to provider selection
Adds the capability to select a NIC based on hardware locality.
Creates a list of NICs that share the same cpuset as the process,
then selects the NIC based on the (local rank) % (number of NICs).
If no NICs are available that share the same cpuset, the selection process
will create a list of all available NICs and make a selection based on
(local rank) % (number of NICs)

Signed-off-by: Nikola Dancejic <dancejic@amazon.com>
2020-05-01 01:05:13 +00:00
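
The selection rule described in the commit above can be sketched as follows. This is only an illustration, not the code in common/ofi: the `Nic` type, the `select_nic` name, and the reading of "share the same cpuset" as "the two CPU sets overlap" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nic:
    name: str
    cpuset: frozenset  # CPU ids near this NIC (hypothetical representation)


def select_nic(nics, process_cpuset, local_rank):
    """Pick a NIC for a process, preferring NICs that share its cpuset."""
    if not nics:
        raise ValueError("no NICs available")
    # First pass: NICs whose cpuset overlaps the cpuset of the process.
    local = [nic for nic in nics if nic.cpuset & process_cpuset]
    # Fallback: no NIC shares the cpuset, so consider every available NIC.
    candidates = local if local else list(nics)
    # Spread local ranks across the candidates round-robin.
    return candidates[local_rank % len(candidates)]
```

With two NICs close to the process, local ranks 0, 1, 2 map to the first, second, and first NIC again; a process with no nearby NIC is spread across the full list by the same modulo rule.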
Ralph Castain
bd29ab0ae9
Update dpm to handle deprecation of MPI_Info keys
Deprecate the current OMPI-specific MPI_Info key definitions for
MPI_Comm_spawn and replace them with their PMIx equivalents. Issue a
deprecation/conversion warning as this is done. Also issue deprecation
warnings for options such as "ompi_non_mpi" that are no longer used.

Handle both cases where the user might pass either the PMIx attribute
name itself (e.g., "PMIX_MAPBY") or the string value of the attribute
(e.g., PMIX_MAPBY, which translates to "pmix.mapby"). This can only be
done for PMIx v4 and above, so protect that code.

Silence a couple of Coverity warnings and add a test along the way.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-04-29 14:56:38 -07:00
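
A rough sketch of the key handling described in the commit above. The lookup tables are illustrative only: the `PMIX_MAPBY` / `pmix.mapby` pair comes from the commit message, while the legacy key `map_by` and the function name `normalize_info_key` are assumptions made for the example, not the contents of ompi/dpm.

```python
import warnings

# Illustrative tables only; the real key lists live in the Open MPI source.
PMIX_ATTRS = {"PMIX_MAPBY": "pmix.mapby"}  # attribute name -> string value
DEPRECATED = {"map_by": "PMIX_MAPBY"}      # assumed legacy key -> PMIx name


def normalize_info_key(key):
    """Translate an MPI_Info key into the PMIx attribute string value."""
    if key in DEPRECATED:
        new_name = DEPRECATED[key]
        # Warn on conversion, as the commit message describes.
        warnings.warn(
            f"info key {key!r} is deprecated; use {new_name!r}",
            DeprecationWarning,
        )
        key = new_name
    if key in PMIX_ATTRS:           # caller passed the attribute name
        return PMIX_ATTRS[key]
    if key in PMIX_ATTRS.values():  # caller passed the string value
        return key
    return None                     # unknown key; the caller decides
```

Accepting both spellings at a single point means the rest of the spawn path only ever sees the PMIx string value.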
Brian Barrett
4f03f44ced
Merge pull request #7582 from dipti-kothari/pml_check
mca/pml: PML check for direct modex
2020-04-27 12:29:11 -07:00
Austen Lauria
2e22a247bb
Merge pull request #7650 from devreal/fix-7617-oscpt2pt-leak
PT2PT osc: don't extra retain datatype
2020-04-24 08:55:28 -04:00
Austen Lauria
9f2f98e3ec
Merge pull request #7651 from devreal/fix-7617-oscrdma-complete_atomic
RDMA osc: remove extra retain on pending_op
2020-04-24 08:55:08 -04:00
Ralph Castain
91be01beb2
Merge pull request #7652 from rhc54/topic/het
Cleanup heterogeneous builds
2020-04-22 16:20:06 -07:00
Ralph Castain
6d29bbfde8
Cleanup heterogeneous builds
Consolidate the ompi_process_info and opal_process_info structs to
remove duplicate storage and conversion issues. Unwind some interweaving
of include files using opal.h. Silence a couple of warnings.

For now, set the arch to local if PMIX_ARCH is not found.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-04-22 12:46:27 -07:00
Dipti Kothari
5418cc56dd mca/pml: PML check for direct modex
For direct modex, all procs publish their selected pml module; then, at
add_procs, the pml module of each proc is checked against that of every
other proc in the add_procs call.
For full modex, there is no change in functionality. Only Rank0
publishes its selected pml; all other procs in the add_procs call
check their selected pml against Rank0.
If the pmls do not match, throw an error and exit.

Signed-off-by: Dipti Kothari <dkothar@amazon.com>
2020-04-22 16:25:01 +00:00
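
The two checking modes in the commit above can be sketched like this. It is a simplified model, not the mca/pml implementation: `check_pml`, the `published` dictionary, and the PML names used in the usage note are assumptions for illustration.

```python
def check_pml(my_pml, published, procs, direct_modex):
    """Raise RuntimeError if a peer selected a different PML.

    published: dict mapping rank -> PML name that rank published.
    procs:     ranks passed to the add_procs call.
    """
    if direct_modex:
        # Direct modex: every proc published its PML, so check each peer.
        to_check = procs
    else:
        # Full modex: only rank 0 published, so compare against rank 0.
        to_check = [0]
    for rank in to_check:
        if published[rank] != my_pml:
            raise RuntimeError(
                f"PML mismatch with rank {rank}: "
                f"{published[rank]!r} != {my_pml!r}"
            )
```

For example, `check_pml("ucx", {0: "ucx", 1: "ob1"}, [0, 1], True)` raises because rank 1 published a different PML, whereas the full-modex path only ever consults rank 0.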
Joseph Schuchart
de67ada442 RDMA osc: remove extra retain on pending_op
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-04-21 22:49:48 +02:00
Joseph Schuchart
07d1011afe OSC base: fix typos in documentation
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-04-21 21:53:36 +02:00
Joseph Schuchart
154cf571b6 OSC base: do not retain datatype by default
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-04-21 21:53:10 +02:00
William Zhang
771f9c011d coll/tuned: Add NULL check to prevent segfault
Signed-off-by: William Zhang <wilzhang@amazon.com>

cr https://code.amazon.com/reviews/CR-23837553
2020-04-21 17:53:46 +00:00
William Zhang
50640402ab coll/tuned: Fix typos
Signed-off-by: William Zhang <wilzhang@amazon.com>
2020-04-21 17:39:37 +00:00
Nikola Dancejic
3637443454 adding NUMA_RANK to process metadata
adding PMIX_NUMA_RANK info to process metadata so that the local NUMA
rank can be accessed through the opal_process_info object.

Signed-off-by: Nikola Dancejic <dancejic@amazon.com>
2020-04-20 22:02:47 +00:00
Ralph Castain
6635795911
Fix intercomm operations
The locality for remote procs is not provided as it is only a local
concept. Thus, you must _always_ use modex_recv_optional to ensure you
don't hang waiting for a response until dmodex times out.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-04-15 17:04:22 -07:00
Ralph Castain
de2d69ca24
Fix hetero builds
Add missing variable declaration

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-04-15 08:46:21 -07:00
Jeff Squyres
8999bae25e Remove OSC pt2pt component
Per https://github.com/open-mpi/ompi/wiki/5.0.x-FeatureList, remove
the OSC pt2pt component.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-04-13 12:29:54 -07:00
Jeff Squyres
b2e0957d6f
Merge pull request #7610 from bosilca/topic/fix_MPI_T
Follow the MPI_T guidelines on return errors.
2020-04-12 14:12:32 -04:00
Ralph Castain
a210f8046f
Cleanup ompi/dpm operations
Do some code cleanup in the connect/accept code. Ensure that the OMPI
layer has access to the PMIx identifier for the process. Add macros for
converting PMIx names to/from strings. Cleanup a few of the simple test
programs. Add a little more info to a btl/tcp error message.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-04-08 08:37:25 -07:00
George Bosilca
f4af1848c9
Follow the MPI_T guidelines on return errors.
As indicated in the MPI3.2 document 14.3.10 page 599 line 1, the only
MPI error code possible is MPI_SUCCESS. All other errors must be in the
error class MPI_T_ERR*.
Fix the return values of a few pvar/cvar functions that failed to
convert correctly to an MPI error code.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2020-04-08 00:02:45 -04:00
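
The rule from the commit above (MPI_T calls return either MPI_SUCCESS or a code in the MPI_T_ERR* class) could be modelled as below. The internal status names on the left are hypothetical; only the MPI_T_ERR* names on the right are meant to match constants defined by the MPI standard, and the helper `to_mpit_error` is an assumption, not a function in Open MPI.

```python
MPI_SUCCESS = "MPI_SUCCESS"

# Hypothetical internal status names mapped to MPI_T error class names.
_INTERNAL_TO_MPI_T = {
    "OK": MPI_SUCCESS,
    "OUT_OF_RESOURCE": "MPI_T_ERR_MEMORY",
    "BAD_INDEX": "MPI_T_ERR_INVALID_INDEX",
    "NOT_INITIALIZED": "MPI_T_ERR_NOT_INITIALIZED",
}


def to_mpit_error(internal_status):
    """Convert an internal status into MPI_SUCCESS or an MPI_T_ERR* name."""
    # Anything unmapped must still land in the MPI_T_ERR* class,
    # never in a generic MPI error class.
    return _INTERNAL_TO_MPI_T.get(internal_status, "MPI_T_ERR_INVALID")
```

Routing every pvar/cvar return through one conversion point makes it hard for a raw internal code to leak out to the caller.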
Jeff Squyres
9687d5e867 Upgrade all www.open-mpi.org URLs to https
Found a handful of other URLs that weren't https-ized, so I updated
them, too (after verifying that they support https, of course).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-04-02 10:43:50 -04:00
Yossi Itigin
5dcd1f4e6c
Merge pull request #7575 from yosefe/topic/pml-ucx-fix-usage-of-mca-pml
pml/ucx: Fix usage of mca_pml_base_pml_check_selected()
2020-03-30 20:06:12 +03:00
Nathan Hjelm
160ff188b8
Merge pull request #7169 from hjelmn/fix_what_wg21_calls_our_problem_not_theirs_seriously__in_some_ways_they_are_correct_but_wtf
configure: use -iquote for non-system include paths
2020-03-30 09:22:54 -07:00
Yossi Itigin
124f0c0d1f pml/ucx: Fix usage of mca_pml_base_pml_check_selected()
Pass the correct ompi_proc_t and array length to
mca_pml_base_pml_check_selected() during dynamic modex.

Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2020-03-29 17:46:45 +03:00
Howard Pritchard
f136a20cae
Merge pull request #6578 from hppritcha/topic/thread_framework2
Implement a MCA framework for threads
2020-03-27 15:55:48 -06:00
Austen Lauria
8a624ab613
Merge pull request #7523 from mkurnosov/fix-bcast-scatterallgather
Fix Bcast scatter_allgather
2020-03-27 14:17:53 -04:00
Shintaro Iwasaki
a7ea0d9bd7 ompi/request: move REQUEST constants from mca/threads to ompi/request
Signed-off-by: Shintaro Iwasaki <siwasaki@anl.gov>
2020-03-27 10:16:04 -06:00
Noah Evans
ee3517427e Add threads framework
Add a framework to support different types of threading models including
user space thread packages such as Qthreads and Argobots:

https://github.com/pmodels/argobots

https://github.com/Qthreads/qthreads

The default threading model is pthreads.  Alternate thread models are
specified at configure time using the --with-threads=X option.

The framework is static.  The threading model to use is selected at
Open MPI configure/build time.

mca/threads: implement Argobots threading layer

config: fix thread configury

- Add double quotations
- Change Argobot to Argobots
config: implement Argobots check

If the poll time is too long, MPI hangs.

This quick fix just sets it to 0, but it is not good for the
Pthreads version. Need to find a good way to abstract it.

Note that even 1 (= 1 millisecond) causes disastrous performance
degradation.

rework threads MCA framework configury

It now works more like the ompi/mca/rte configury,
modulo some edge items that are special for threading package
linking, etc.

qthreads module
some argobots cleanup

Signed-off-by: Noah Evans <noah.evans@gmail.com>
Signed-off-by: Shintaro Iwasaki <siwasaki@anl.gov>
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2020-03-27 10:15:45 -06:00
Brian Barrett
64d70b3076 ofi: Call add_procs through PML
Change ompi_mtl_ofi_get_endpoint() to call the active PML's
add_procs() rather than the OFI MTL add_procs() directly when
discovering a new process during operation.

Functionally, this has no impact on correct operation.  However,
the current behavior means that the heterogeneous and active PML
checks are not being executed in the dynamic discovery case.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-03-27 06:06:42 -07:00
Ralph Castain
c704ed4cc5
Merge pull request #7554 from rhc54/topic/proc1
ompi_proc_t size reduction: part 1
2020-03-26 13:23:06 -07:00
Ralph Castain
33ab928e1b ompi_proc_t size reduction: part 1
We currently save the hostname of a proc when we create the ompi_proc_t for it. This was originally done because the only method we had for discovering the host of a proc was to include that info in the modex, and we had to therefore store it somewhere proc-local. Obviously, this carried a memory penalty for storing all those strings, and so we added a "cutoff" parameter so that we wouldn't collect hostnames above a certain number of procs.

Unfortunately, this still results in an 8-byte/proc memory cost as we have a char* pointer in the opal_proc_t that is contained in the ompi_proc_t so that we can store the hostname of the other procs if we fall below the cutoff. At scale, this can consume a fair amount of memory.

With the switch to relying on PMIx, there is no longer a need to cache the proc hostnames. Using the "optional" feature of PMIx_Get, we restrict the retrieval to be purely proc-local - i.e., we retrieve the info either via shared memory or from within the proc-internal hash storage (depending upon the active PMIx components). Thus, the retrieval of a hostname is purely a local operation involving no communication.

All RM's are required to provide a complete hostname map of all procs at startup. Thus, we have full access to all hostnames without including them in a modex or having to cache them on each proc. This allows us to remove the char* pointer from the opal_proc_t, saving us 8-bytes/proc.

Unfortunately, PMIx_Get does not currently support the return of a static pointer to memory. Thus, even though PMIx has the hostname in its memory, it can only return a malloc'd version of it. I have therefore ensured that the return from opal_get_proc_hostname is consistently malloc'd and free'd wherever used. This shouldn't be a burden as the hostname is only used in one of two circumstances:

(a) in an error message
(b) in a verbose output for debugging purposes

Thus, there should be no performance penalty associated with the malloc/free requirement. PMIx will eventually be returning static pointers, and so we can eventually simplify this method and return a "const char*" - but as noted, this really isn't an issue even today.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-23 12:49:44 -07:00
Ralph Castain
9bb06d0077
Merge pull request #7559 from rhc54/topic/fixes
Bunch of fixes plus PMIx/PRRTE updates
2020-03-23 12:49:18 -07:00
Ralph Castain
95dacd2086
Fix singletons and ensure adequate PMIx version
OMPI can only support PMIx v3 and above. PRRTE requires at least PMIx
v4, so protect against the case where OMPI is built against an external
PMIx v3.

Fix check of PMIx_Init return code for singleton operations.

Ensure that the PMIx framework gets properly opened.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-23 10:29:42 -07:00
Austen Lauria
ecd20ddcac Revert "Ensure we get our local topology"
Per the devel mailing list, we discussed the need/desirability of this change. Here is the logic behind not including it:

If you call "hwloc_topology_load", then hwloc merrily does its discovery and slams many-core systems. If you call "opal_hwloc_get_topology", then that is fine - it checks if we already have it, tries to get it from PMIx (using shared mem for hwloc 2.x), and only does the discovery if no other method is available.

We previously decided to let those who need the topology call "opal_hwloc_get_topology" to ensure the topo was available so that we don't load it unless someone actually needs it - in the case where it isn't available via PMIx, this avoids paying the startup time and memory footprint penalties for no reason. The function is protected so it will simply return SUCCESS if the topology is already defined.

After discussion, it was decided to stick with that "only setup the topology if someone actually needs it" approach. Hence, we will not blanket init the topology, and the mtl/ofi component will call opal_hwloc_get_topology to ensure the topo has been defined prior to using it.

Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
2020-03-23 11:15:47 -04:00
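
The "only set up the topology if someone actually needs it" approach described above is essentially lazy initialization with a cheap source tried first. A minimal sketch follows; `TopologyCache` and its two callables are hypothetical stand-ins for the behaviour the message attributes to opal_hwloc_get_topology, not its actual code.

```python
class TopologyCache:
    """Load the topology on first use, preferring the cheap source."""

    def __init__(self, fetch_from_pmix, discover):
        self._topology = None
        self._fetch_from_pmix = fetch_from_pmix  # cheap; may return None
        self._discover = discover                # expensive full discovery

    def get_topology(self):
        # Already defined: repeated calls just return the cached value.
        if self._topology is not None:
            return self._topology
        # Try the cheap path first, then fall back to discovery.
        topo = self._fetch_from_pmix()
        if topo is None:
            topo = self._discover()
        self._topology = topo
        return topo
```

Nothing is loaded when the cache is constructed, so a process that never asks for the topology never pays the discovery cost.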
Austen Lauria
cf5ca14f7a
Merge pull request #7547 from rhc54/topic/hwloc
Ensure we get our local topology
2020-03-23 10:13:02 -04:00
Austen Lauria
391370a05c
Merge pull request #7546 from devreal/egrequest-obj-retain
grequestx: fix racy initialization and premature destruction
2020-03-23 10:12:32 -04:00
Ralph Castain
2979bb2ce8
Update PMIx and PRRTE to reduce mpirun complexity
Use "prte" instead of "prun" for proxy execution of cmds like mpirun.
This avoids the fork/exec-rendezvous complexities and should result in
more reliable operation.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-20 13:49:12 -07:00
Ralph Castain
98893a530b
Ensure we get our local topology
Restore missing call to get_topology - others were doing it in their
components as repeated calls just return success, but let's ensure it is
always present.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-20 09:28:20 -07:00