Ralph Castain
9803d69d02
Ensure the embedded PMIx respects an OMPI-level --disable-debug
2015-12-01 08:00:24 -08:00
Ralph Castain
52ea538bc1
Per fix from Nysal: set the listener_active flag before starting the progress thread, and declare the flag to be volatile
2015-11-09 09:00:59 -08:00
Ralph Castain
fed28e4cfc
Add missing file that was previously ignored
2015-11-06 14:37:09 -08:00
Ralph Castain
5f446570d8
Work on cleaning up memory leaks that are causing orte-dvm to eventually run out of memory. Still don't have everything plugged, but getting better. Sync to the PMIx master that includes removal of the pmix_common.h.in file that really didn't need to be generated, and update to the PMIx_server_init API.
2015-11-06 14:15:30 -08:00
Ralph Castain
206e9a011e
Add a couple of missing translations to/from PMIx internal and OPAL error constants
2015-10-29 12:33:02 -07:00
Ralph Castain
8ad9b450c4
Silence Coverity warning
2015-10-28 20:10:28 -07:00
George Bosilca
c9d0fffab3
Add a missing include.
2015-10-28 00:50:58 -04:00
Ralph Castain
267ca8fcd3
Cleanup the PMIx direct modex support. Add an MCA parameter pmix_base_async_modex that will cause the async modex to be used when set to 1. Default it to 0 for now to continue current default behavior.
Also add an MCA param pmix_base_collect_data to direct that the blocking fence shall return all data to each process. Obviously, this param has no effect if async_modex is used.
2015-10-27 17:31:56 -07:00
George Bosilca
6c28f114f1
Silence a warning regarding the format str for snprintf.
2015-10-24 15:24:40 -04:00
Gilles Gouaillardet
0221f59197
pmix1xx configury: invoke sub-configure with CFLAGS and CPPFLAGS on the command line
...
If CFLAGS and/or CPPFLAGS are passed on the ompi configure command line, pmix1xx
configure will not use the correct ones previously passed in the environment.
See the discussion started at http://www.open-mpi.org/community/lists/devel/2015/10/18159.php
Thanks to Siegmar Gross for bringing this to our attention.
2015-10-22 10:13:52 +09:00
Ralph Castain
363f62a506
Fix singleton operations when running under a SLURM allocation. Sadly, SLURM's PMI will return success even if the PMI server isn't actually available. This leads to erroneous selection of pmix and ess components. So add a further requirement (namely, that we see a job_step envar) to the SLURM pmix components along with some modification of ess selection code to avoid the problem
2015-10-17 20:24:03 -07:00
annu13
cc5e1e26a5
sync with pmix master (repo_rev git69c398e)
2015-10-09 15:17:43 -07:00
annu13
5787e9248f
cleaned up debug stmts
2015-10-06 06:25:36 -07:00
annu13
30ba00e05d
sync with master
2015-10-06 06:04:54 -07:00
annu13
6f37c0e3e8
sync with PMIX master
2015-10-02 17:25:48 -07:00
annu13
7434c47626
sync with PMIX master
2015-10-02 17:17:48 -07:00
Ralph Castain
a4a3dfd480
Cleanup the code a bit by simply adding our nspace to the top of the list of jobid <-> nspace correlations. Add two new APIs to opal_pmix for registering new jobid/nspace pairs and retrieving an nspace given a jobid - these are required to support connect/accept. No impact on the PMIx library.
2015-09-28 08:50:13 -07:00
Ralph Castain
f713e71d51
Minor cleanup - add jobid <-> nspace in one more place
2015-09-27 14:48:39 -07:00
Ralph Castain
fad5638596
Resolve the naming issue when direct-launched by PMIx-enabled RMs using a minimal-impact approach. Detect if we were launched via ORTE - if so, then use our standard methods for computing the jobid. If not, then just hash the nspace to create the jobid, and track the jobid <-> nspace correspondence down in the opal/mca/pmix/pmix1xx component. We then do the translation any time a function that passes process names is invoked.
2015-09-27 09:57:59 -07:00
Ralph Castain
209600fe26
Sync to PMIx master
2015-09-23 21:00:30 -07:00
Ralph Castain
749bd4e6fe
Plug a few memory leaks identified by valgrind
2015-09-23 15:21:04 -07:00
Nathan Hjelm
8c4da756cf
pmix: do not touch recently freed memory
...
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-23 08:44:50 -06:00
Ralph Castain
4c654ffd94
Sync to PMIx master
2015-09-21 21:27:06 -07:00
Ralph Castain
c1bbbb5e2f
Remove the last involvement of the OOB system from the MPI layer, remove the no-longer-needed usock/oob component, and have procs no longer open the RML, OOB, ROUTED, and GRPCOMM frameworks as PMIx now provides all required app-mpirun cmds
2015-09-15 13:08:35 -07:00
Ralph Castain
22d7c0081a
Fix the no-disconnect test by resolving a segfault on free - opal_dss.unload will return the remaining unpacked portion of a buffer. As such, it cannot return the pointer to that info as it might be partway inside of a malloc'd region. So copy the data out of the buffer.
2015-09-11 13:01:35 -07:00
Ralph Castain
dc5796b8a1
Revert "Revert "Fix the handling of cpusets so we get the correct cpuset for each local peer. Add the ability to indicate that a modex request is "optional" so we don't call the server if we don't find the value. Take advantage of that to allow the MPI layer to decide that the lack of locality info indicates non-local""
...
Fix the locality computation by correctly computing the vpid of the local peer
This reverts commit open-mpi/ompi@6a8fad49e5.
2015-09-11 08:29:51 -07:00
Ralph Castain
6a8fad49e5
Revert "Fix the handling of cpusets so we get the correct cpuset for each local peer. Add the ability to indicate that a modex request is "optional" so we don't call the server if we don't find the value. Take advantage of that to allow the MPI layer to decide that the lack of locality info indicates non-local"
...
This reverts commit f94f3cda214ab937c46802896fb53b84bec6cc3a.
2015-09-11 02:01:25 -07:00
Ralph Castain
e0a52354d4
Sync to PMIx master at open-mpi/pmix@89680d6663
...
Includes changes to support BigEndian machines
2015-09-10 20:47:40 -07:00
Ralph Castain
f94f3cda21
Fix the handling of cpusets so we get the correct cpuset for each local peer. Add the ability to indicate that a modex request is "optional" so we don't call the server if we don't find the value. Take advantage of that to allow the MPI layer to decide that the lack of locality info indicates non-local
2015-09-10 10:25:30 -07:00
Ralph Castain
4c47c498ac
Sync to latest PMIx master
...
Allow the blocking send and recv to keep trying
2015-09-09 11:48:47 -07:00
Gilles Gouaillardet
7f0ed74d24
pmix1xx: fix CPPFLAGS when DSO are not built
2015-09-09 14:20:12 +09:00
rhc54
3a446c9797
Merge pull request #876 from rhc54/topic/hnp
...
Fix segfault upon job error
2015-09-08 15:10:51 -07:00
rhc54
47f437608d
Merge pull request #875 from rhc54/topic/dynamics
...
Stop a segfault in the test by correctly passing all the argv during spawn
2015-09-08 14:35:42 -07:00
Ralph Castain
459f169e06
Fix segfault upon job error
...
Silence some unnecessary error-logs
2015-09-08 14:03:06 -07:00
Ralph Castain
ae7156cabb
Stop a segfault in the test by correctly passing all the argv during spawn
2015-09-08 13:42:46 -07:00
rhc54
8053357fcc
Merge pull request #873 from rhc54/topic/static
...
Add the libs required for PMIx to support static builds (and trim all excess whitespace)
2015-09-08 11:28:47 -07:00
Ralph Castain
291afe502f
Add the libs required for PMIx to support static builds
...
Remove unneeded CPPFLAGS
2015-09-08 10:21:06 -07:00
Jeff Squyres
bc9e5652ff
whitespace: purge whitespace at end of lines
...
Generated by running "./contrib/whitespace-purge.sh".
2015-09-08 09:47:17 -07:00
Ralph Castain
e6add86e4f
Deal with connect/accept between two jobs from different mpirun's. Somewhat optimize connect/accept by using MPI bcast to distribute the participants instead of another PMIx lookup. Cleanup some Coverity issues.
2015-09-07 09:19:24 -07:00
Ralph Castain
37c3ed68e7
Cleanup connect/disconnect and bring comm_spawn back online!
2015-09-06 10:27:39 -07:00
Ralph Castain
f6948c2bb4
Sync with PMIx master 43e45c3. Get multi-node publish/lookup/unpublish working
2015-09-04 10:07:17 -07:00
Ralph Castain
a772b46c15
Bring the MPI_Publish and friends online
2015-09-02 12:04:07 -07:00
Ralph Castain
95dbd70f44
Sync to PMIx 1.1, sha- 51479b0
2015-09-01 14:09:25 -07:00
rhc54
d8cb3fe705
Merge pull request #852 from rhc54/topic/pmix
...
Sync to PMIx tarball - includes:
2015-09-01 06:54:34 -07:00
Gilles Gouaillardet
6dfa996760
configury: fix a typo in opal/mca/pmix/pmix1xx/configure.m4
2015-09-01 14:59:07 +09:00
Ralph Castain
c1bbd7bc78
Sync to PMIx tarball - includes:
...
* update to configury to silence ident messages (thanks Gilles!)
* fix for warnings Jeff saw when get didn't find the requested data
* fix for Mac OSX operations
2015-08-31 21:51:02 -07:00
Ralph Castain
ef69958e01
Only copy the value across if the "get" operation succeeded
2015-08-31 17:11:26 -07:00
Ralph Castain
a3842af709
Sync to PMIx tarball
2015-08-31 07:47:46 -07:00
Ralph Castain
bcabd1e282
Sync with PMIx tarball, bringing across the warning fixes pointed out by Gilles
2015-08-30 21:13:55 -07:00
Gilles Gouaillardet
7e6a213465
pmix: fix compilation error
...
compilation failed because of missing prototypes when configure'd with --enable-debug --enable-picky on a CentOS 7 box
2015-08-31 10:33:13 +09:00