1
1
Граф коммитов

59 Коммитов

Автор SHA1 Сообщение Дата
Ralph Castain
95dacd2086
Fix singletons and ensure adequate PMIx version
OMPI can only support PMIx v3 and above. PRRTE requires at least PMIx
v4, so protect against the case where OMPI is built against an external
PMIx v3.

Fix check of PMIx_Init return code for singleton operations.

Ensure that the PMIx framework gets properly opened.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-03-23 10:29:42 -07:00
Jeff Squyres
cb1e424359 opal_check_pmi.m4: properly save top-level flags
CPPFLAGS, LDFLAGS, and LIBS were only being saved conditionally, but
restored unconditionally.  This could result in wiping out
CPPFLAGS/LDFLAGS/LIB.

Make sure to *always* save these flags so that when they are restored,
they are restored to their proper value.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-03-17 22:15:41 -04:00
Howard Pritchard
488f656c11 fix an issue with configuring with external pmix
External pmix installs are frequently in non-standard locations and
the path to their shared libraries are not ldconfig'd in because
there may be multiple pmix installs.

The way the configury was set up prior to this patch, the configuration
would fail soon after the PMIX config stuff was called because it
added some pmix lib stuff to the LDFLAGS, resulting in configury tests
for things that require running the configure test to fail.

This patch avoids this problem by resetting the LDFLAGS and LIBS back
to what they were prior to the run of the external PMIX detection.

The CFLAGS setting is left because there are many places in the ompi
and opal source code where pmix_common.h needs to be included.

Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
2020-02-23 13:04:24 -07:00
Gilles Gouaillardet
174e967dbc
Remove ORTE project
Will be replaced by PRRTE. Ensure that OMPI and OPAL layers build
without reference to ORTE. Setup opal/pmix framework to be static.
Remove support for all PMI-1 and PMI-2 libraries. Add support for
"external" pmix component as well as internal v4 one.

remove orte: misc fixes

 - UCX fixes
 - VPATH issue
 - oshmem fixes
 - remove useless definition
 - Add PRRTE submodule
 - Get autogen.pl to traverse PRRTE submodule
 - Remove stale orcm reference
 - Configure embedded PRRTE
 - Correctly pass the prefix to PRRTE
 - Correctly set the OMPI_WANT_PRRTE am_conditional
 - Move prrte configuration to the end of OMPI's configure.ac
 - Make mpirun a symlink to prun, when available
 - Fix makedist with --no-orte/--no-prrte option
 - Add a `--no-prrte` option which is the same as the legacy
   `--no-orte` option.
 - Remove embedded PMIx tarball. Replace it with new submodule
   pointing to OpenPMIx master repo's master branch
 - Some cleanup in PRRTE integration and add config summary entry
 - Correctly set the hostname
 - Fix locality
 - Fix singleton operations
 - Fix support for "tune" and "am" options

Signed-off-by: Ralph Castain <rhc@pmix.org>
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2020-02-07 18:20:06 -08:00
Ralph Castain
c054d4d1cc Ensure we push/pop local AC vars in the right place
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-02-21 13:28:10 -08:00
Ralph Castain
cd1b5641be Update slurm pmi configury to account for pmix
When Slurm is built against PMIx, some installations place a copy of the
PMIx library that Slurm is linking against in the Slurm PMI location.
Current configury ignores that location. The desired behavior is to look
for a PMIx lib in that location when --with-pmi is given. If the user
also specifies --with-pmix and gives a different location, then override
anything previously found and look for it where the user directed.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-02-21 11:33:35 -08:00
Ralph Castain
1bd772e8eb Remove the stale orte-dvm code
Users should migrate to https://github.com/pmix/prrte

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-10-17 15:11:38 -07:00
Ralph Castain
58ddd78760 Update PMIx detection code
Correctly pickup external 4x version, improve reporting in summary

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-10-01 15:16:25 -07:00
Ralph Castain
466cad6cb2 Update master to PMIx v4
Retain ext3x for PMIx 3  compatibility
Get the blasted permissions correct on config files

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-09-13 08:24:17 -07:00
Ralph Castain
1e6aaf7f22 Default to internal PMIx if newer than external
Per https://github.com/open-mpi/ompi/issues/5031, if the user didn't specify a particular PMIx installation, then default back to the internal version if it is newer than the discovered external one. PMIx doesn't yet provide a full signature so we have to just get as close as possible for now.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-07-19 11:02:03 -07:00
Ralph Castain
4a596d35f7 Remove the PMIx ext4x component
Update configury to redirect anything at or above v3 to the ext3x component

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-07-13 19:51:50 -07:00
Ralph Castain
aeb415a3d0 Add warning for PMIx v1.2.x - dynamic ops not supported
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-07-10 16:24:21 -07:00
Ralph Castain
09963affba Further detail check for external PMIx
Per today's telecon, check for supported version and do not use anything less than 1.2.x. Sadly, we don't include the last piece of the version triplet in the version file and so we cannot check for 1.2.5.

If someone explicitly points us at an external installation that isn't acceptable, then error out

Add PMIx support to summary

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-07-10 16:12:52 -07:00
Ralph Castain
fdca304268 Default to external PMIx installation
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-07-10 16:12:52 -07:00
Ralph Castain
48f27655a6 Sync to PMIx v3.0rc and add ext4x
Sync to the draft rc for PMIx v3.0. Add an external component for PMIx master, which is at v4.0

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-11 05:54:23 -07:00
Gilles Gouaillardet
83dd8cd3fc configury: look for PMI header in DIR provided by --with-pmi=DIR
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-02-26 09:46:26 +09:00
Gilles Gouaillardet
b86e0f04bf configury: fix PMI detection
and do not end up with -L/usr/lib[64] when PMI libraries
are installed in the default location.

Thanks Davide Vanzo for the report.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-02-24 01:16:12 +09:00
Ralph Castain
01e6539127 Revert "Filter /usr[/local]/include from opal CPPFLAGS when used explictly --with-package=DIR"
This reverts commit c4fe4ecfb9.

Revert "Fix  DIR, DIR/include search for --with-pmix"

This reverts commit 2e3f401763.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-17 16:02:19 -08:00
Philip Kovacs
2e3f401763 Fix DIR, DIR/include search for --with-pmix
Signed-off-by: Philip Kovacs <pkdevel@yahoo.com>
2018-01-10 17:17:48 -05:00
Ralph Castain
3906aaf41a Silence warnings
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-11-25 11:50:18 -08:00
Gilles Gouaillardet
db2f3643d7 configury: enhance external PMIx detection - add the --with-pmix-libdir=DIR option look for libpmix.* libs in DIR, DIR/lib64 and DIR - if --with-pmix=DIR is given, look for libpmix.* in DIR/lib64 and DIR/lib
Fixes open-mpi/ompi#4347

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-10-17 17:03:01 +09:00
Ralph Castain
bbd83fd4c0 Add a new launcher "prun" for starting applications against the ORTE DVM.
Unlike "orterun", "prun" is a PMIx-only program that discovers the DVM connection instead of requiring that we explicitly provide it. Only build "prun" if PMIx v2.x is available.

This gets the DVM working again, but still is showing problems for multiple executions. I'll detail those in a separate issue. Thus, the DVM should still be considered "broken".

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-12 21:40:41 -07:00
Ralph Castain
1a0bccb536 Now that PMIx has settled on its release strategy and numbering, update the OPAL pmix framework to track
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-02 15:44:43 -08:00
Ralph Castain
af67f16422 Update configury to support multiple PMIx versions, rename pmix2x component to pmix3x for support of PMIx master
Update support for external v1.1.x and v2.x libraries. Minor corrections to the v3.x component
2016-08-25 18:19:05 -07:00
Ralph Castain
639dbdb7ea For maintainability, fold the external PMIx 2.x integration into the internal PMIx 2.x library component. This ensures that we always stay in sync with the two as that is becoming a problem. 2016-08-22 13:28:55 -07:00
Ralph Castain
12ecf972af Split the pmix external component into one for the 1.1.4 release, and another for the upcoming 2.0 release. Clean up the configury so the components look for a series-specific function instead of running a program.
NOTE: the changes for the 2.0 series are not yet in the PMIx master.
2016-06-01 14:15:24 -07:00
Ralph Castain
7b115a9e0b Patch from Gilles - modify detection of PMIx version for external libraries 2016-05-30 14:30:10 -07:00
Ralph Castain
55923eacd3 Stealing some pieces of Josh Hursey's PR #1583 and modifying a bit, allow the opal/pmix external component to handle both PMIx 1.1.4 and PMIx 2.0 versions. Automatically detect the version of the target external library and adjust the only two APIs that changed (PMIx_Init and PMIx_Finalize)
Rename temp vars in .m4 to avoid conflict with Travis
2016-05-27 08:06:31 -07:00
Ralph Castain
9a5ef60602 Ensure we fail to configure if external PMIx was requested and is not found. 2016-05-11 07:59:05 -07:00
Jeff Squyres
f3e3b800e9 opal_check_pmi.m4: remove stale code and comments
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-05-02 12:51:01 -04:00
Gilles Gouaillardet
08d91b9a03 pmix/external: revamp external pmix package detection 2016-05-02 16:23:31 +09:00
Ralph Castain
ddf0f272e1 Fix typo 2015-12-29 07:04:28 -08:00
Gilles Gouaillardet
55ae6768d3 configury: use --with-pmix option instead of --with-external-pmix 2015-12-28 23:14:59 +09:00
Ralph Castain
3a56f0d34b Create the pmix external component. Fix a few places where opal/util/argv.h were required when building with an external pmix (go figure).
NOTE: Building with external pmix *requires* that you also build with external libevent and hwloc libraries. Detect this at configure and error out with large message if this requirement is violated.

Closes #1204  (replaces it)
Fixes #1064
2015-12-15 15:26:13 -08:00
Nysal Jan K.A
2d2ea63231 Fix PMI and PMI2 builds 2015-08-11 21:13:45 +05:30
Ralph Castain
869041f770 Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
Jeff Squyres
0166318966 opal_check_pmi: protect un-prefixed shell variables
Since there's unfortunately only a global namespace for shell
variables, we need to protect un-prefixed shell variables with
OPAL_VAR_SCOPE_PUSH/POP.
2015-03-13 04:48:31 -07:00
Ralph Castain
1c1d9f69f6 Update the PMI configury to support addition of --with-pmi-libdir option for environments that install the PMI libs in a non-default location 2015-02-27 10:48:28 -08:00
Howard Pritchard
67b0f8de7f separate out cray pmi config
The mixing of the Slurm PMI and Cray PMI configure was getting
messy and dangerous - developers working on Slurm PMI often don't
have access to Cray PMI, etc.

This mod pulls out the Cray PMI configure into a separate m4 file.
Cray pmi is now configured as follows:

1) on Cray CLE 5 and higher, Cray PMI is auto detected.  pkg-config
   is used to resolve the necessary CPP flags, link flags and libs,
   etc.  Nothing needs to be added to the configure line to pick up
   Cray PMI.

2) on legacy Cray CLE 4 systems with PMI 4.X, Cray PMI is also
   auto detected.

3) on legacy Cray CLE 4 systems with PMI 5.X Cray PMI can't be auto-detected
   owing to changes in the PMI pkg-config file which result in pkg-config
   returning an error owing to a dependency of PMI on newer versions of ALPS
   installs that are not present on CLE 4. So, for those falling in to this
   situation, the --with-cray-pmi=(DIR) method needs to be used.

   DIR specifies the Cray PMI install directory.  The configure file looks
   for required alps libraries first in /usr/lib/alps, then in
   /opt/cray/xe-sysroot/default/usr/lib/alps.
2014-11-04 15:25:25 -07:00
Jeff Squyres
81dafbdba1 opal_check_pmi.m4: trivial whitespace cleanup; no code changes 2014-11-01 04:02:23 -07:00
Ralph Castain
4f0c1ae8d9 Continue cleanup of the PMI config code. Eliminate the multiple calls to check for pmi1 and pmi2 - we must check it only once to get the pmix components to build only in the correct situations. Ensure we set the wrapper flags so we handle static builds correctly. 2014-10-27 20:37:33 -07:00
Gilles Gouaillardet
8da6c14e06 configury: fix a typo in config/opal_check_pmi.m4 2014-10-24 13:31:36 +09:00
Ralph Castain
75d8a7f25b Refactor the PMI configure check logic to be a lot cleaner and simpler. 2014-10-23 18:35:18 -07:00
Nathan Hjelm
083a659217 Correct some typos in Cray PMI detection 2014-10-14 10:28:36 -06:00
Nathan Hjelm
169a1866b8 Modify Cray PMI check to detect PMI on older systems 2014-10-09 17:01:31 -06:00
Ralph Castain
b1a58726ac Cleanup the PMI m4 syntax with respect to -a, and look for libpmi* so we can pickup both .a, .la, and whatever other extensions that particular system might use. 2014-10-09 14:04:43 -07:00
Nadezhda Kogteva
ffa8674e01 Fix bugs in PMI configure: set correct include path, fix test command with multiple conditions. 2014-10-09 17:23:56 +03:00
Ralph Castain
9c027e6def Update the PMI configure logic to handle the oddball case where both lib and lib64 may exist, and the required files may be in one or the other of them. 2014-10-07 10:20:46 -07:00
Ralph Castain
9e35f80ab6 Don't multiply define WANT_PMI_SUPPORT and friends. Turns out they weren't being used anywhere anyway, so no point in defining them at all
This commit was SVN r32822.
2014-09-30 20:43:25 +00:00
Ralph Castain
aec5cd08bd Per the PMIx RFC:
WHAT:    Merge the PMIx branch into the devel repo, creating a new
               OPAL “lmix” framework to abstract PMI support for all RTEs.
               Replace the ORTE daemon-level collectives with a new PMIx
               server and update the ORTE grpcomm framework to support
               server-to-server collectives

WHY:      We’ve had problems dealing with variations in PMI implementations,
               and need to extend the existing PMI definitions to meet exascale
               requirements.

WHEN:   Mon, Aug 25

WHERE:  https://github.com/rhc54/ompi-svn-mirror.git

Several community members have been working on a refactoring of the current PMI support within OMPI. Although the APIs are common, Slurm and Cray implement a different range of capabilities, and package them differently. For example, Cray provides an integrated PMI-1/2 library, while Slurm separates the two and requires the user to specify the one to be used at runtime. In addition, several bugs in the Slurm implementations have caused problems requiring extra coding.

All this has led to a slew of #if’s in the PMI code and bugs when the corner-case logic for one implementation accidentally traps the other. Extending this support to other implementations would have increased this complexity to an unacceptable level.

Accordingly, we have:

* created a new OPAL “pmix” framework to abstract the PMI support, with separate components for Cray, Slurm PMI-1, and Slurm PMI-2 implementations.

* Replaced the current ORTE grpcomm daemon-based collective operation with an integrated PMIx server, and updated the grpcomm APIs to provide more flexible, multi-algorithm support for collective operations. At this time, only the xcast and allgather operations are supported.

* Replaced the current global collective id with a signature based on the names of the participating procs. The allows an unlimited number of collectives to be executed by any group of processes, subject to the requirement that only one collective can be active at a time for a unique combination of procs. Note that a proc can be involved in any number of simultaneous collectives - it is the specific combination of procs that is subject to the constraint

* removed the prior OMPI/OPAL modex code

* added new macros for executing modex send/recv to simplify use of the new APIs. The send macros allow the caller to specify whether or not the BTL supports async modex operations - if so, then the non-blocking “fence” operation is used, if the active PMIx component supports it. Otherwise, the default is a full blocking modex exchange as we currently perform.

* retained the current flag that directs us to use a blocking fence operation, but only to retrieve data upon demand

This commit was SVN r32570.
2014-08-21 18:56:47 +00:00