Ralph Castain
1d44f0c0e2
Silence Coverity warnings
2016-08-11 21:22:01 -07:00
Ralph Castain
73544d2e00
Rename symbol
2016-08-11 13:06:46 -07:00
Ralph Castain
b0cc9b0bc8
Update to latest PMIx toolext branch
...
Fix indentations
Update the ext20 component to match latest PMIx master.
Cleanup name conflicts and uninit vars
2016-08-11 12:29:48 -07:00
Ralph Castain
527b5c692a
Update to include extended tool support, new datatypes
2016-08-08 13:39:46 -07:00
Artem Polyakov
b24ec3e3b9
pmix/s2: fix indentation (only)
2016-08-06 16:31:19 +06:00
Artem Polyakov
2cb923a413
pmix/s1: fix indentation (only)
2016-08-06 16:30:45 +06:00
Artem Polyakov
8aa3ef7799
pmix/s2: fix s2 component data placement
...
Use wildcard for the information related to the job-level data.
Fixes s2 component with regard to PR https://github.com/open-mpi/ompi/pull/1897 .
2016-08-06 15:49:16 +06:00
Artem Polyakov
81063f1717
pmix/s1: fix s1 component data placement
...
Use wildcard for the information related to the job-level data.
Fixes s1 component with regard to PR https://github.com/open-mpi/ompi/pull/1897 .
2016-08-06 15:45:46 +06:00
Gilles Gouaillardet
30f98cd9d0
pmix: redefine OPAL_PMIX_ARCH macro
...
Architecture is set by the ompi layer *after* job startup, so the key cannot
have the "pmix" prefix since optimizations in open-mpi/ompi@01a653d50a
otherwise architecture cannot be retrieved
2016-08-04 13:31:28 +09:00
Gilles Gouaillardet
21e7f31dbe
pmix2x: fix unpack sequence in PMIx_Get callback
...
first unpack the nspace (PMIX_STRING) before unpacking the various keys (PMIX_KVAL)
2016-08-01 14:21:22 +09:00
Ralph Castain
16fccd4964
Establish a way for ORTE to tell PMIx the base tmpdir to use, and update PMIx to understand such directives
2016-07-29 09:52:36 -07:00
Ralph Castain
cacb582ecd
Support timeout values when performing connect/accept operations. Bump default timeout to 10 minutes so folks have time to start the partnering application
2016-07-28 14:09:06 -07:00
Howard Pritchard
b65bbe017f
pmix/cray: switch to using wildcards for some
...
items so that at least srun native launch on
cray works again.
More issues to fix when using alps.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2016-07-26 17:07:58 -05:00
Ralph Castain
71de03fc67
Cleanup the new naming requirements to ensure that info is correctly retrieved
...
Cleanup permissions
Restore singleton operations
2016-07-21 09:46:03 -07:00
Ralph Castain
2b55ee8118
Cleanup Coverity warnings
2016-07-20 20:31:58 -07:00
Ralph Castain
01a653d50a
Remove a debug print in comm_cid.c. Update PMIx2 to include the revised PMIx_Get logic for higher performance by reducing the number of hash table lookups. Fix a bug where requests for data from a proc in another nspace could hang, or result in "not found".
...
Remove stale file reference
Restore autogen pass thru pmix
Remove generated file
2016-07-20 00:58:19 -07:00
Nathan Hjelm
03bce91de8
pmix/pmix2x: add missing increment in loop
...
This commit fixes a bug in the pmix2x client code where a loop
variable is not correctly incremented. This was leading to hangs and
crashes when creating intercommunicators. Also fixed two double
increments in other loops.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-07-18 10:35:05 -06:00
Jeff Squyres
72f41d4490
pmix: replace all tabs with spaces
...
No code or logic changes
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-07-17 15:08:33 -04:00
Jeff Squyres
1c32742c66
pmix_ext20: fix syntax error
...
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-07-17 15:04:12 -04:00
Ralph Castain
99f7096031
Fix permissions
2016-07-16 21:03:55 -07:00
Ralph Castain
d4071fbd1c
Fix dynamic operations by ensuring that we only fire the debugger release if the debugger is attached, and that the OPAL pmix key for directing events to non-default handlers matches the PMIx spelling
2016-07-16 13:20:41 -07:00
Ralph Castain
1ceb35ba5c
Fix singletons - do not include the PMIx tool URI in the environment provided to child processes
2016-07-13 17:33:34 -07:00
Ralph Castain
20a91c2baf
Add a new --continuous flag to mpirun that directs ORTE to let a job continue running as app procs terminate. Don't attempt to restart them. Add event notification of abnormally terminating procs, and demonstrate that in the mpi_spin test program.
...
Cleanup debug message
2016-07-13 15:28:33 -07:00
Artem Polyakov
72585a905f
opal/pmix: add blocking Fence to SLURM components.
...
Blocking fence is used in yalla del proc. Native pmix exposes this functionality.
We need to expose it for SLURM's s1/s2 components as well.
Also this commit fixes uninitialized `rc` in fencenb's of both
components.
2016-07-11 09:43:15 +03:00
Artem Polyakov
8e16f47492
Merge pull request #1688 from artpol84/fix_base64
...
Fix base64 implementation in pmix framework.
2016-07-07 10:47:50 +06:00
Gilles Gouaillardet
acda07472a
configury: revamp and re-ident sub configure.m4 after open-mpi/ompi@846360fd4c
2016-07-06 11:59:51 +09:00
Gilles Gouaillardet
846360fd4c
configury: correctly perform make distclean when {libevent,hwloc,pmix} are external components
...
Thanks Jeff for the guidance
Fixes open-mpi/ompi#1683
note:
in order to keep this commit easy to review, some AS_IF([...]) were replaced with
AS_IF([false], ...) or AS_IF_([true], ...)
these will be removed and re-idented in a subsequent commit
2016-07-06 11:57:24 +09:00
Ralph Castain
ee56d9dc1a
Shorten the session directory name as some OS's are now providing unusually long temp directory names, causing us to overflow the sockaddr field
2016-07-05 14:59:50 -07:00
Ralph Castain
7e0af3f4f0
Update pmix2x to track upstream changes
2016-07-05 11:54:22 -07:00
Gilles Gouaillardet
267821f0dd
pmix2x/pmix: fix a typo in PMIx_tool_init()
...
and remove now useless local variable i
2016-07-05 13:47:50 +09:00
Gilles Gouaillardet
efce8cc734
pmix2x/pmix: add missing include files
...
pmix cannot be built on alpine linux because of some missing includes.
uid_t and gid_t are defined in unistd.h or sys/types.h, and unistd.h
is not indirectly pulled under alpine linux, so do it manually.
Thanks N.L.K Nguyen for the report
(back-ported from upstream pmix/master@c8d55350a9 )
2016-07-05 09:03:14 +09:00
Ralph Castain
c9ada8e095
Silence Coverity warnings
2016-07-03 20:45:08 -07:00
Ralph Castain
673f82e2b6
Update the PMIx listener to avoid leaking sockets into children, and better handle race condition errors
2016-07-03 08:23:33 -07:00
Ralph Castain
6e434d6785
Add support for PMIx tool connections and queries. Initially only support a request to list all known namespaces (jobids) from ORTE, but other folks will extend that support to include additional information
...
Update to match PMIx RFC
Fix configury to point to correct libevent and hwloc locations
2016-06-29 19:19:19 -07:00
Ralph Castain
08b1438f15
Add missing PMIx range value so OPAL and PMIx align again
2016-06-22 22:03:25 -07:00
Gilles Gouaillardet
bf133c401e
pmix2x: fix a typo in dereg_event_hdlr()
...
This bug has been fixed when open-mpi/ompi@dde69e1be2 was backported into upstream pmix in pmix/master@5e5577778c
but it was not fixed in open-mpi/ompi
2016-06-22 13:45:29 +09:00
Ralph Castain
441739b5a4
Cleanup a lagging message that generates an annoying (but seemingly harmless) warning
2016-06-20 12:23:27 -07:00
Ralph Castain
0ba02821e6
Add requested key and job-level info
2016-06-19 18:22:31 -07:00
Ralph Castain
0a29f5cb77
Sigh - missed two typos
2016-06-18 20:57:53 -07:00
Ralph Castain
dd38cf1fed
Fix typo
2016-06-18 20:56:43 -07:00
Ralph Castain
dde69e1be2
Cleanup CIDs 1362763, 1362762, 1362760, 1362759, 1362758, 1362757, 1362756, 1362755, 1362754. Unsure how to resolve 1362761.
...
Fixes #1792
2016-06-18 12:28:46 -07:00
Ralph Castain
044c561cba
Roll to latest PMIx master
2016-06-16 17:30:30 -07:00
Ralph Castain
5d330d5220
Enable the PMIx event notification capability and use that for all error notifications, including debugger release. This capability requires use of PMIx 2.0 or above as the features are not available with earlier PMIx releases. When OMPI master is built against an earlier external version, it will fallback to the prior behavior - i.e., debugger will be released via RML and all notifications will go strictly to the default error handler.
...
Add PMIx 2.0
Remove PMIx 1.1.4
Cleanup copying of component
Add missing file
Touchup a typo in the Makefile.am
Update the pmix ext114 component
Minor cleanups and resync to master
Update to latest PMIx 2.x
Update to the PMIx event notification branch latest changes
2016-06-14 13:08:41 -07:00
Ralph Castain
d58da99dbc
Shift to memcpy to avoid Solaris issues
2016-06-09 12:07:17 -07:00
Ralph Castain
8fa935534b
Abstract the strnlen function for environments that do not have it (e.g., Solaris 10)
2016-06-08 10:12:43 -07:00
Gilles Gouaillardet
b707d138fe
pmix114/pmix1_client: fix misc memory leaks
...
Fixes CID 1325146-1325149
2016-06-06 09:33:35 +09:00
Ralph Castain
ecea1e3bb5
Update to 1.1.4rc3
2016-06-01 20:56:07 -07:00
Ralph Castain
12ecf972af
Split the pmix external component into one for the 1.1.4 release, and another for the upcoming 2.0 release. Clean up the configury so the components look for a series-specific function instead of running a program.
...
NOTE: the changes for the 2.0 series are not yet in the PMIx master.
2016-06-01 14:15:24 -07:00
Ralph Castain
55923eacd3
Stealing some pieces of Josh Hursey's PR #1583 and modifying a bit, allow the opal/pmix external component to handle both PMIx 1.1.4 and PMIx 2.0 versions. Automatically detect the version of the target external library and adjust the only two APIs that changed (PMIx_Init and PMIx_Finalize)
...
Rename temp vars in .m4 to avoid conflict with Travis
2016-05-27 08:06:31 -07:00
Artem Polyakov
725eea2819
Fix base64 implementation in pmix framework.
...
In the commit 80f07b65f16e9538aca7fc5e124d2074e7e0b69e setting of '-' marker used
as the string termination sign was moved from base64 code:
from: 80f07b65f1 (diff-1b10896c267d2591dc2c08fd0542ab67L491)
to: 80f07b65f1 (diff-1b10896c267d2591dc2c08fd0542ab67R189)
However the decoding function wasn't fixed and still expects on extra
byte at the end of the encoded string which leads to data truncation
during extraction (was noticed on standalone code that was using base64
from OMPI).
2016-05-23 23:30:31 +06:00