openmpi

Author	SHA1	Message	Date
Ralph Castain	1a0bccb536	Now that PMIx has settled on its release strategy and numbering, update the OPAL pmix framework to track Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2016-12-02 15:44:43 -08:00
Ralph Castain	af67f16422	Update configury to support multiple PMIx versions, rename pmix2x component to pmix3x for support of PMIx master Update support for external v1.1.x and v2.x libraries. Minor corrections to the v3.x component	2016-08-25 18:19:05 -07:00
Ralph Castain	639dbdb7ea	For maintainability, fold the external PMIx 2.x integration into the internal PMIx 2.x library component. This ensures that we always stay in sync with the two as that is becoming a problem.	2016-08-22 13:28:55 -07:00
Ralph Castain	12ecf972af	Split the pmix external component into one for the 1.1.4 release, and another for the upcoming 2.0 release. Clean up the configury so the components look for a series-specific function instead of running a program. NOTE: the changes for the 2.0 series are not yet in the PMIx master.	2016-06-01 14:15:24 -07:00
Ralph Castain	7b115a9e0b	Patch from Gilles - modify detection of PMIx version for external libraries	2016-05-30 14:30:10 -07:00
Ralph Castain	55923eacd3	Stealing some pieces of Josh Hursey's PR #1583 and modifying a bit, allow the opal/pmix external component to handle both PMIx 1.1.4 and PMIx 2.0 versions. Automatically detect the version of the target external library and adjust the only two APIs that changed (PMIx_Init and PMIx_Finalize) Rename temp vars in .m4 to avoid conflict with Travis	2016-05-27 08:06:31 -07:00
Ralph Castain	9a5ef60602	Ensure we fail to configure if external PMIx was requested and is not found.	2016-05-11 07:59:05 -07:00
Jeff Squyres	f3e3b800e9	opal_check_pmi.m4: remove stale code and comments Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-05-02 12:51:01 -04:00
Gilles Gouaillardet	08d91b9a03	pmix/external: revamp external pmix package detection	2016-05-02 16:23:31 +09:00
Ralph Castain	ddf0f272e1	Fix typo	2015-12-29 07:04:28 -08:00
Gilles Gouaillardet	55ae6768d3	configury: use --with-pmix option instead of --with-external-pmix	2015-12-28 23:14:59 +09:00
Ralph Castain	3a56f0d34b	Create the pmix external component. Fix a few places where opal/util/argv.h were required when building with an external pmix (go figure). NOTE: Building with external pmix requires that you also build with external libevent and hwloc libraries. Detect this at configure and error out with large message if this requirement is violated. Closes #1204 (replaces it) Fixes #1064	2015-12-15 15:26:13 -08:00
Nysal Jan K.A	2d2ea63231	Fix PMI and PMI2 builds	2015-08-11 21:13:45 +05:30
Ralph Castain	869041f770	Purge whitespace from the repo	2015-06-23 20:59:57 -07:00
Jeff Squyres	0166318966	opal_check_pmi: protect un-prefixed shell variables Since there's unfortunately only a global namespace for shell variables, we need to protect un-prefixed shell variables with OPAL_VAR_SCOPE_PUSH/POP.	2015-03-13 04:48:31 -07:00
Ralph Castain	1c1d9f69f6	Update the PMI configury to support addition of --with-pmi-libdir option for environments that install the PMI libs in a non-default location	2015-02-27 10:48:28 -08:00
Howard Pritchard	67b0f8de7f	separate out cray pmi config The mixing of the Slurm PMI and Cray PMI configure was getting messy and dangerous - developers working on Slurm PMI often don't have access to Cray PMI, etc. This mod pulls out the Cray PMI configure into a separate m4 file. Cray pmi is now configured as follows: 1) on Cray CLE 5 and higher, Cray PMI is auto detected. pkg-config is used to resolve the necessary CPP flags, link flags and libs, etc. Nothing needs to be added to the configure line to pick up Cray PMI. 2) on legacy Cray CLE 4 systems with PMI 4.X, Cray PMI is also auto detected. 3) on legacy Cray CLE 4 systems with PMI 5.X Cray PMI can't be auto-detected owing to changes in the PMI pkg-config file which result in pkg-config returning an error owing to a dependency of PMI on newer versions of ALPS installs that are not present on CLE 4. So, for those falling in to this situation, the --with-cray-pmi=(DIR) method needs to be used. DIR specifies the Cray PMI install directory. The configure file looks for required alps libraries first in /usr/lib/alps, then in /opt/cray/xe-sysroot/default/usr/lib/alps.	2014-11-04 15:25:25 -07:00
Jeff Squyres	81dafbdba1	opal_check_pmi.m4: trivial whitespace cleanup; no code changes	2014-11-01 04:02:23 -07:00
Ralph Castain	4f0c1ae8d9	Continue cleanup of the PMI config code. Eliminate the multiple calls to check for pmi1 and pmi2 - we must check it only once to get the pmix components to build only in the correct situations. Ensure we set the wrapper flags so we handle static builds correctly.	2014-10-27 20:37:33 -07:00
Gilles Gouaillardet	8da6c14e06	configury: fix a typo in config/opal_check_pmi.m4	2014-10-24 13:31:36 +09:00
Ralph Castain	75d8a7f25b	Refactor the PMI configure check logic to be a lot cleaner and simpler.	2014-10-23 18:35:18 -07:00
Nathan Hjelm	083a659217	Correct some typos in Cray PMI detection	2014-10-14 10:28:36 -06:00
Nathan Hjelm	169a1866b8	Modify Cray PMI check to detect PMI on older systems	2014-10-09 17:01:31 -06:00
Ralph Castain	b1a58726ac	Cleanup the PMI m4 syntax with respect to -a, and look for libpmi* so we can pickup both .a, .la, and whatever other extensions that particular system might use.	2014-10-09 14:04:43 -07:00
Nadezhda Kogteva	ffa8674e01	Fix bugs in PMI configure: set correct include path, fix test command with multiple conditions.	2014-10-09 17:23:56 +03:00
Ralph Castain	9c027e6def	Update the PMI configure logic to handle the oddball case where both lib and lib64 may exist, and the required files may be in one or the other of them.	2014-10-07 10:20:46 -07:00
Ralph Castain	9e35f80ab6	Don't multiply define WANT_PMI_SUPPORT and friends. Turns out they weren't being used anywhere anyway, so no point in defining them at all This commit was SVN r32822.	2014-09-30 20:43:25 +00:00
Ralph Castain	aec5cd08bd	Per the PMIx RFC: WHAT: Merge the PMIx branch into the devel repo, creating a new OPAL “pmix” framework to abstract PMI support for all RTEs. Replace the ORTE daemon-level collectives with a new PMIx server and update the ORTE grpcomm framework to support server-to-server collectives WHY: We’ve had problems dealing with variations in PMI implementations, and need to extend the existing PMI definitions to meet exascale requirements. WHEN: Mon, Aug 25 WHERE: https://github.com/rhc54/ompi-svn-mirror.git Several community members have been working on a refactoring of the current PMI support within OMPI. Although the APIs are common, Slurm and Cray implement a different range of capabilities, and package them differently. For example, Cray provides an integrated PMI-1/2 library, while Slurm separates the two and requires the user to specify the one to be used at runtime. In addition, several bugs in the Slurm implementations have caused problems requiring extra coding. All this has led to a slew of #if’s in the PMI code and bugs when the corner-case logic for one implementation accidentally traps the other. Extending this support to other implementations would have increased this complexity to an unacceptable level. Accordingly, we have: * created a new OPAL “pmix” framework to abstract the PMI support, with separate components for Cray, Slurm PMI-1, and Slurm PMI-2 implementations. * Replaced the current ORTE grpcomm daemon-based collective operation with an integrated PMIx server, and updated the grpcomm APIs to provide more flexible, multi-algorithm support for collective operations. At this time, only the xcast and allgather operations are supported. * Replaced the current global collective id with a signature based on the names of the participating procs.
This allows an unlimited number of collectives to be executed by any group of processes, subject to the requirement that only one collective can be active at a time for a unique combination of procs. Note that a proc can be involved in any number of simultaneous collectives - it is the specific combination of procs that is subject to the constraint * removed the prior OMPI/OPAL modex code * added new macros for executing modex send/recv to simplify use of the new APIs. The send macros allow the caller to specify whether or not the BTL supports async modex operations - if so, then the non-blocking “fence” operation is used, if the active PMIx component supports it. Otherwise, the default is a full blocking modex exchange as we currently perform. * retained the current flag that directs us to use a blocking fence operation, but only to retrieve data upon demand This commit was SVN r32570.	2014-08-21 18:56:47 +00:00
Ralph Castain	3b64c603b4	First stage of RFC to rename OMPI_foo build system support: change OMPI_CHECK_PACKAGE -> OPAL_CHECK_PACKAGE This commit was SVN r31582.	2014-05-01 14:24:56 +00:00
Ralph Castain	f259d50ed7	Fully fix the PMI2 warning - turned out to be larger than originally thought due to the way the function was being handled across multiple files. Properly resolve the problem by not compiling the file if PMI2 is not desired, and then appropriately setting the visibility of the function within the module Refs trac:4400 This commit was SVN r31084. The following Trac tickets were found above: Ticket 4400 --> https://svn.open-mpi.org/trac/ompi/ticket/4400	2014-03-17 17:36:37 +00:00
Ralph Castain	af4a9a0688	Make clear that --with-pmi can/should be used to specify the path to the pmi installation since at least one person didn't realize it. cmr=v1.7.4:reviewer=jsquyres This commit was SVN r30439.	2014-01-27 22:50:37 +00:00
Ralph Castain	e6199da2e7	Fixes trac:3486 - prevent opal_check_pmi from bleeding CPPFLAGS This commit was SVN r28940. The following Trac tickets were found above: Ticket 3486 --> https://svn.open-mpi.org/trac/ompi/ticket/3486	2013-07-24 03:53:23 +00:00
Ralph Castain	5f520e241b	Ensure we get both -lpmi and -lpmi2 when the libs are separate This commit was SVN r28795.	2013-07-16 14:57:18 +00:00
Ralph Castain	e8340b6339	There is no convention out there as to how OEMs handle PMI2 functions. Some put them in their own -lpmi2 library, and some don't. Some have split the PMI2 definitions into a pmi2.h and keep the PMI-1 definitions in a separate pmi.h, and some don't. Try to handle cases more generally so at least Slurm and Cray can co-exist in peace. This commit was SVN r28672.	2013-06-26 00:43:26 +00:00
Ralph Castain	fa943dc6ff	Cleanup a few things in the revised PMI configury - we know slurm has both pmi and pmi2 libs, so just auto-detect the presence of them if the user directed us to build with pmi support. Also cleanup some changed names in the alps code This commit was SVN r28670.	2013-06-24 02:41:40 +00:00
Joshua Ladd	0b5c1f2ea8	Add 'generic' support for PMI2 (previously, we checked for PMI2 only on Cray systems.) If your resource manager (e.g. SLURM) has support for PMI2, then the --with-pmi configure flag will enable its usage. If you don't have PMI2, then you will fallback to regular old PMI1. This patch was submitted by Ralph Castain and reviewed and pushed by Josh Ladd. This should be added to cmr:v1.7:reviewer=jladd This commit was SVN r28666.	2013-06-21 15:28:14 +00:00
Ralph Castain	bd9265c560	Per the meeting on moving the BTLs to OPAL, move the ORTE database "db" framework to OPAL so the relocated BTLs can access it. Because the data is indexed by process, this requires that we define a new "opal_identifier_t" that corresponds to the orte_process_name_t struct. In order to support multiple run-times, this is defined in opal/mca/db/db_types.h as a uint64_t without identifying the meaning of any part of that data. A few changes were required to support this move: 1. the PMI component used to identify rte-related data (e.g., host name, bind level) and package them as a unit to reduce the number of PMI keys. This code was moved up to the ORTE layer as the OPAL layer has no understanding of these concepts. In addition, the component locally stored data based on process jobid/vpid - this could no longer be supported (see below for the solution). 2. the hash component was updated to use the new opal_identifier_t instead of orte_process_name_t as its index for storing data in the hash tables. Previously, we did a hash on the vpid and stored the data in a 32-bit hash table. In the revised system, we don't see a separate "vpid" field - we only have a 64-bit opaque value. The orte_process_name_t hash turned out to do nothing useful, so we now store the data in a 64-bit hash table. Preliminary tests didn't show any identifiable change in behavior or performance, but we'll have to see if a move back to the 32-bit table is required at some later time. 3. the db framework was a "select one" system. However, since the PMI component could no longer use its internal storage system, the framework has now been changed to a "select many" mode of operation. This allows the hash component to handle all internal storage, while the PMI component only handles pushing/pulling things from the PMI system. 
This was something we had planned for some time - when fetching data, we first check internal storage to see if we already have it, and then automatically go to the global system to look for it if we don't. Accordingly, the framework was provided with a custom query function used during "select" that lets you separately specify the "store" and "fetch" ordering. 4. the ORTE grpcomm and ess/pmi components, and the nidmap code, were updated to work with the new db framework and to specify internal/global storage options. No changes were made to the MPI layer, except for modifying the ORTE component of the OMPI/rte framework to support the new db framework. This commit was SVN r28112.	2013-02-26 17:50:04 +00:00

37 Commits