openmpi

Author	SHA1	Message	Date
Ralph Castain	952726c121	Update to latest PMIx master - equivalent to 2.0rc2. Update the thread support in the opal/pmix framework to protect the framework-level structures. This now passes the loop test, and so we believe it resolves the random hangs in finalize. Changes in PMIx master that are included here: * Fixed a bug in the PMIx_Get logic * Fixed self-notification procedure * Made pmix_output functions thread safe * Fixed a number of thread safety issues * Updated configury to use 'uname -n' when hostname is unavailable Work on cleaning up the event handler thread safety problem Rarely used functions, but protect them anyway Fix the last part of the intercomm problem Ensure we don't cover any PMIx calls with the framework-level lock. Protect against NULL argv comm_spawn Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-06-20 09:02:15 -07:00
Ralph Castain	8f526968c2	Do not hang if we cannot relay messages. Eliminate extra error log message Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-06-05 06:35:19 -07:00
Ralph Castain	9d6b929894	Fix uninitialized variable. Set exit codes for failed launch so we get pretty error messages Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-05-31 07:38:37 -07:00
Ralph Castain	9f60cd0fe7	Update the connect/accept support so we check to see if we have the proper infrastructure and RTE support, including whether we have ompi-server available if the connect/accept spans multiple applications. Print pretty help messages in all cases where we do not have support Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-05-27 10:47:08 -07:00
Ralph Castain	657e701c65	Add debug verbosity to the orte data server and pmix pub/lookup functions Start updating the various mappers to the new procedure. Remove the stale lama component as it is now very out-of-date. Bring round_robin and PPR online, and modify the mindist component (but cannot test/debug it). Remove unneeded test Fix memory corruption by re-initializing variable to NULL in loop Resolve the race condition identified by @ggouaillardet by resetting the mapped flag within the same event where it was set. There is no need to retain the flag beyond that point as it isn't used again. Add a new job attribute ORTE_JOB_FULLY_DESCRIBED to indicate that all the job information (including locations and binding) is included in the launch message. Thus, the backend daemons do not need to do any map computation for the job. Use this for the seq, rankfile, and mindist mappers until someone decides to update them. Note that this will maintain functionality, but means that users of those three mappers will see large launch messages and less performant scaling than those using the other mappers. Have the mindist module add procs to the job's proc array as it is a fully described module Protect the hnp-not-in-allocation case Per path suggested by Gilles - protect the HNP node when it gets added in the absence of any other allocation or hostfile Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-05-25 18:41:27 -07:00
Ralph Castain	8b1f01dfe6	Set the default modex parameters back to full blocking modex while we continue to test and debug the slow modex - it seems to be having issues on the Cray Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-04-22 15:19:46 -07:00
Ralph Castain	9fc3079ac2	Implement a background fence that collects all data during modex operation The direct modex operation is slow, especially at scale for even modestly-connected applications. Likewise, blocking in MPI_Init while we wait for a full modex to complete takes too long. However, as George pointed out, there is a middle ground here. We could kickoff the modex operation in the background, and then trap any modex_recv's until the modex completes and the data is delivered. For most non-benchmark apps, this may prove to be the best of the available options as they are likely to perform other (non-communicating) setup operations after MPI_Init, and so there is a reasonable chance that the modex will actually be done before the first modex_recv gets called. Once we get instant-on-enabled hardware, this won't be necessary. Clearly, zero time will always out-perform the time spent doing a modex. However, this provides a decent compromise in the interim. This PR changes the default settings of a few relevant params to make "background modex" the default behavior: * pmix_base_async_modex -> defaults to true * pmix_base_collect_data -> continues to default to true (no change) * async_mpi_init - defaults to true. Note that the prior code attempted to base the default setting of this value on the setting of pmix_base_async_modex. Unfortunately, the pmix value isn't set prior to setting async_mpi_init, and so that attempt failed to accomplish anything. The logic in MPI_Init is: * if async_modex AND collect_data are set, AND we have a non-blocking fence available, then we execute the background modex operation * if async_modex is set, but collect_data is false, then we simply skip the modex entirely - no fence is performed * if async_modex is not set, then we block until the fence completes (regardless of collecting data or not) * if we do NOT have a non-blocking fence (e.g., we are not using PMIx), then we always perform the full blocking modex operation. * if we do perform the background modex, and the user requested the barrier be performed at the end of MPI_Init, then we check to see if the modex has completed when we reach that point. If it has, then we execute the barrier. However, if the modex has NOT completed, then we block until the modex does complete and skip the extra barrier. So we never perform two barriers in that case. HTH Ralph Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-04-21 10:29:23 -07:00
Gilles Gouaillardet	6ef281e163	pmix/base: fix misc memory leaks Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 13:46:35 +09:00
Ralph Castain	884fb7fcf2	Update the PMIx2 support to include the latest shared memory optimizations Update ORTE support for dynamic PMIx operations e.g., PMIx_Spawn Update to track master Ensure that --disable-pmix-dstore actually disables the dstore. Sync to a few debugger updates Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2016-12-14 15:00:10 -08:00
Howard Pritchard	3049848731	Fix pmix base help file. Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2016-12-02 15:03:22 -06:00
Howard Pritchard	703b464c03	pmix: fix a typo in a help file Fixes #2391 Thanks to @njoly for reporting Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2016-11-12 11:49:15 -07:00
Ralph Castain	527b5c692a	Update to include extended tool support, new datatypes	2016-08-08 13:39:46 -07:00
Ralph Castain	cacb582ecd	Support timeout values when performing connect/accept operations. Bump default timeout to 10 minutes so folks have time to start the partnering application	2016-07-28 14:09:06 -07:00
Jeff Squyres	72f41d4490	pmix: replace all tabs with spaces No code or logic changes Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-07-17 15:08:33 -04:00
Artem Polyakov	8e16f47492	Merge pull request #1688 from artpol84/fix_base64 Fix base64 implementation in pmix framework.	2016-07-07 10:47:50 +06:00
Ralph Castain	5d330d5220	Enable the PMIx event notification capability and use that for all error notifications, including debugger release. This capability requires use of PMIx 2.0 or above as the features are not available with earlier PMIx releases. When OMPI master is built against an earlier external version, it will fallback to the prior behavior - i.e., debugger will be released via RML and all notifications will go strictly to the default error handler. Add PMIx 2.0 Remove PMIx 1.1.4 Cleanup copying of component Add missing file Touchup a typo in the Makefile.am Update the pmix ext114 component Minor cleanups and resync to master Update to latest PMIx 2.x Update to the PMIx event notification branch latest changes	2016-06-14 13:08:41 -07:00
Artem Polyakov	725eea2819	Fix base64 implementation in pmix framework. In the commit 80f07b65f16e9538aca7fc5e124d2074e7e0b69e setting of '-' marker used as the string termination sign was moved from base64 code: from: `80f07b65f1 (diff-1b10896c267d2591dc2c08fd0542ab67L491)` to: `80f07b65f1 (diff-1b10896c267d2591dc2c08fd0542ab67R189)` However the decoding function wasn't fixed and still expects one extra byte at the end of the encoded string which leads to data truncation during extraction (was noticed on standalone code that was using base64 from OMPI).	2016-05-23 23:30:31 +06:00
Ralph Castain	4a55fba414	Fix registration of error handlers thru the pmix120 component. A thread-shift operation was hanging on the sync_event_base, which made it dependent on someone calling opal_progress. Unfortunately, a process in "sleep" or spinning outside the MPI library won't do that, and so we never complete errhandler registration.	2016-03-02 15:01:01 -08:00
Ralph Castain	810f2446b7	Add pmix120 component, update the error handling functions in the PMIx API. Update the configure logic for the new pmix120 component ckpt Get the pmix120 component to work - still not really registering or handling notifications, but infrastructure now operates Cleanup some of the symbol scopes, and provide a more comprehensive rename.h file. Will pretty it up later - let's see how this works Cleanup the rename files to use the pretty macros	2015-12-28 23:15:44 +09:00
Gilles Gouaillardet	f0e3e16f49	pmix/base: add missing #include <unistd.h> Thanks Marco Atzeri for contributing the original patch	2015-12-24 14:41:52 +09:00
Ralph Castain	3a56f0d34b	Create the pmix external component. Fix a few places where opal/util/argv.h were required when building with an external pmix (go figure). NOTE: Building with external pmix requires that you also build with external libevent and hwloc libraries. Detect this at configure and error out with large message if this requirement is violated. Closes #1204 (replaces it) Fixes #1064	2015-12-15 15:26:13 -08:00
Ralph Castain	267ca8fcd3	Cleanup the PMIx direct modex support. Add an MCA parameter pmix_base_async_modex that will cause the async modex to be used when set to 1. Default it to 0 for now to continue current default behavior. Also add an MCA param pmix_base_collect_data to direct that the blocking fence shall return all data to each process. Obviously, this param has no effect if async_modex is used.	2015-10-27 17:31:56 -07:00
Nathan Hjelm	8b5810f7f7	mca/base: add priority output to mca_base_select The mca_base_select function uses returned priorities to select the best component/module. This priority may be of use to the caller so pass that information back in an optional argument. If the priority is not needed pass NULL. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-10-19 12:32:41 -06:00
Ralph Castain	1b7930ad52	Silence some warnings and address Coverity issues	2015-09-16 07:58:22 -07:00
Ralph Castain	c1bbbb5e2f	Remove the last involvement of the OOB system from the MPI layer, remove the no-longer-needed usock/oob component, and have procs no longer open the RML, OOB, ROUTED, and GRPCOMM frameworks as PMIx now provides all required app-mpirun cmds	2015-09-15 13:08:35 -07:00
Ralph Castain	1cdb86b8c7	Cleanup s1 and s2 components, and ensure that mpirun and orteds only use non-direct-launch pmix components.	2015-09-08 18:37:09 -07:00
Ralph Castain	37c3ed68e7	Cleanup connect/disconnect and bring comm_spawn back online!	2015-09-06 10:27:39 -07:00
Ralph Castain	a772b46c15	Bring the MPI_Publish and friends online	2015-09-02 12:04:07 -07:00
Ralph Castain	cf6137b530	Integrate PMIx 1.0 with OMPI. Bring Slurm PMI-1 component online Bring the s2 component online Little cleanup - let the various PMIx modules set the process name during init, and then just raise it up to the ORTE level. Required as the different PMI environments all pass the jobid in different ways. Bring the OMPI pubsub/pmi component online Get comm_spawn working again Ensure we always provide a cpuset, even if it is NULL pmix/cray: adjust cray pmix component for pmix Make changes so cray pmix can work within the integrated ompi/pmix framework. Bring singletons back online. Implement the comm_spawn operation using pmix - not tested yet Cleanup comm_spawn - procs now starting, error in connect_accept Complete integration	2015-08-29 16:04:10 -07:00
Ralph Castain	a2243dcddd	Add an opal/errhandler so opal-level errors can be up-leveled	2015-07-11 07:09:11 -07:00
Ralph Castain	869041f770	Purge whitespace from the repo	2015-06-23 20:59:57 -07:00
Nathan Hjelm	7b7993e406	pmix/base: fix coverity issue CID 1269707 Logically dead code (DEADCODE) Coverity is correct that tmp3 can never be NULL here. Deleted the dead code. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-05-29 09:02:56 -06:00
Howard Pritchard	0980423c5f	pmix/base: fix coverity error Remove some obviously dead code and thus fix a coverity error - CID #133 Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2015-05-16 13:24:03 -06:00
Gilles Gouaillardet	0ce59f2d29	pmix: fix misc memory leaks as reported by Coverity as CID 1269843, 1269854, 1269856, 1269857 and 1269858	2015-02-16 11:19:43 +09:00
Jeff Squyres	0dbbffb753	pmix_base_frame: use the "= { 0 }" initializer Per open-mpi/ompi#381, convert the specific initialization of opal_pmix to use the generic "= { 0 }" initializer. This form can be used to initialize any type when the intent is just to zero out / assign some value.	2015-02-05 17:51:06 -05:00
Jeff Squyres	621af3aa07	pmix_base: fix global opal_pmix symbol for static linking on OS X OS X has weirdness when static linking. If a symbol is not initialized, it is put into the common block section, and Weird Things happen (linking when trying to use that global symbol will fail). If you initialize the variable, it goes into a different section (and linking to it will work). This link (that might go stale someday) has some information about OS X linker scope and treatment of symbol definitions: https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/MachOTopics/1-Articles/executing_files.html#//apple_ref/doc/uid/TP40001829-98432-TPXREF120 Fixes #375.	2015-02-04 12:12:31 -05:00
Ralph Castain	ba25e8a0ce	Fix singletons	2015-01-27 09:29:42 -06:00
Ralph Castain	780c93ee57	Per the PR and discussion on today's telecon, extend the process name definition as a two-field struct of uint32_t's down to the OPAL layer. This resolves issues created by prior commits that impacted both heterogeneous and SPARC support. This also simplifies the OMPI code base by removing the need for frequent memcpy's when transitioning between the OMPI/ORTE layers and OPAL. We recognize that this means other users of OPAL will need to "wrap" the opal_process_name_t if they desire to abstract it in some fashion. This is regrettable, and we are looking at possible alternatives that might mitigate that requirement. Meantime, however, we have to put the needs of the OMPI community first, and are taking this step to restore hetero and SPARC support.	2014-11-11 17:00:42 -08:00
Gilles Gouaillardet	80f07b65f1	pmix: correctly split pmi messages Thanks to @elenash for all the reviews	2014-11-11 17:16:00 +09:00
Gilles Gouaillardet	7508c6f3ad	pmix: correctly handle NULL OPAL_BYTE_OBJECT object	2014-10-22 17:15:21 +09:00
Nadezhda Kogteva	2bce929330	MTL MXM cleanup: unnecessary OMPI_MTL_MXM_CONNECT_ON_FIRST_COMM variable removed	2014-10-20 10:29:47 +03:00
Gilles Gouaillardet	27dcca0bb2	pmi/s1: fix large keys do not overwrite the PMI key when pushing a message that does not fit within 255 bytes	2014-10-16 13:29:32 +09:00
Elena	c905fe9b78	pmix: removed pmix_base_direct modex mca parameter, renamed orte_full_modex_cutoff and ompi_hostname_cutoff to direct_modex_cutoff	2014-10-09 06:15:31 +02:00
Ralph Castain	8d0b4f222a	The pmix.get functions should not be returning "success" if the requested info isn't found. Fix the macros and the component functions so they correctly return "not found" in that situation, and set the data regions and size to NULL and 0, respectively. This commit was SVN r32818.	2014-09-30 18:03:12 +00:00
Ralph Castain	6323b226c7	Bring over some updates from the PMIx branch - mostly just minor cleanups. Make the direct grpcomm component no longer be the default. For now, we seem to be having problems with non-blocking fence operations, so make them not be the default under any scenario (e.g., when sm is the only btl in operation). This commit was SVN r32673.	2014-09-06 19:19:44 +00:00
Ralph Castain	41c6058153	Bring over changes to MXM from pmix branch: MTL MXM: establish endpoint connection on the first communication when direct_modex used This commit was SVN r32668.	2014-09-03 18:22:11 +00:00
Gilles Gouaillardet	d743da18bf	pmix: fix process name parsing on 32 bits systems opal_process_name_t is an uint64_t which is not equivalent to an unsigned long on 32 bits systems. this is now parsed as an unsigned long long. This commit was SVN r32592.	2014-08-25 03:08:02 +00:00
Ralph Castain	aec5cd08bd	Per the PMIx RFC: WHAT: Merge the PMIx branch into the devel repo, creating a new OPAL “pmix” framework to abstract PMI support for all RTEs. Replace the ORTE daemon-level collectives with a new PMIx server and update the ORTE grpcomm framework to support server-to-server collectives WHY: We’ve had problems dealing with variations in PMI implementations, and need to extend the existing PMI definitions to meet exascale requirements. WHEN: Mon, Aug 25 WHERE: https://github.com/rhc54/ompi-svn-mirror.git Several community members have been working on a refactoring of the current PMI support within OMPI. Although the APIs are common, Slurm and Cray implement a different range of capabilities, and package them differently. For example, Cray provides an integrated PMI-1/2 library, while Slurm separates the two and requires the user to specify the one to be used at runtime. In addition, several bugs in the Slurm implementations have caused problems requiring extra coding. All this has led to a slew of #if’s in the PMI code and bugs when the corner-case logic for one implementation accidentally traps the other. Extending this support to other implementations would have increased this complexity to an unacceptable level. Accordingly, we have: * created a new OPAL “pmix” framework to abstract the PMI support, with separate components for Cray, Slurm PMI-1, and Slurm PMI-2 implementations. * Replaced the current ORTE grpcomm daemon-based collective operation with an integrated PMIx server, and updated the grpcomm APIs to provide more flexible, multi-algorithm support for collective operations. At this time, only the xcast and allgather operations are supported. * Replaced the current global collective id with a signature based on the names of the participating procs. This allows an unlimited number of collectives to be executed by any group of processes, subject to the requirement that only one collective can be active at a time for a unique combination of procs. Note that a proc can be involved in any number of simultaneous collectives - it is the specific combination of procs that is subject to the constraint * removed the prior OMPI/OPAL modex code * added new macros for executing modex send/recv to simplify use of the new APIs. The send macros allow the caller to specify whether or not the BTL supports async modex operations - if so, then the non-blocking “fence” operation is used, if the active PMIx component supports it. Otherwise, the default is a full blocking modex exchange as we currently perform. * retained the current flag that directs us to use a blocking fence operation, but only to retrieve data upon demand This commit was SVN r32570.	2014-08-21 18:56:47 +00:00

48 commits
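
The MPI_Init modex rules listed in commit 9fc3079ac2 can be read as a small decision table. The sketch below is one possible reading of those bullets in Python, not the Open MPI implementation (which is in C). The names `ModexAction`, `choose_modex`, and `final_sync` are hypothetical and exist only for illustration. The behavior when no end-of-init barrier was requested is not described in the commit message and is an assumption here.

```python
from enum import Enum


class ModexAction(Enum):
    """Hypothetical labels for the three outcomes named in the commit message."""
    BACKGROUND = "background modex via non-blocking fence"
    SKIP = "no fence performed"
    BLOCKING = "full blocking modex"


def choose_modex(async_modex: bool, collect_data: bool, have_nb_fence: bool) -> ModexAction:
    """Pick the modex strategy from the three settings described in the commit."""
    # Without a non-blocking fence (e.g., not using PMIx) the commit says
    # the full blocking modex is always performed.
    if not have_nb_fence:
        return ModexAction.BLOCKING
    # async_modex not set: block until the fence completes,
    # regardless of whether data is collected.
    if not async_modex:
        return ModexAction.BLOCKING
    # async_modex and collect_data both set: run the modex in the background.
    if collect_data:
        return ModexAction.BACKGROUND
    # async_modex set but collect_data false: skip the modex entirely.
    return ModexAction.SKIP


def final_sync(barrier_requested: bool, modex_done: bool) -> str:
    """End-of-MPI_Init step for the background-modex case only."""
    if not barrier_requested:
        # Assumption: the commit message does not describe this case.
        return "nothing"
    # If the modex already finished, run the requested barrier; otherwise
    # waiting for the modex stands in for the barrier, so there is never
    # a second barrier.
    return "barrier" if modex_done else "wait_for_modex"


if __name__ == "__main__":
    for flags in [(True, True, True), (True, False, True),
                  (False, True, True), (True, True, False)]:
        print(flags, "->", choose_modex(*flags).name)
```

With the defaults introduced by that commit (`pmix_base_async_modex` and `pmix_base_collect_data` both true) and a non-blocking fence available, this reading selects the background modex. Commit 8b1f01dfe6 later set the defaults back to the full blocking modex.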