rhc54
2228d2efc2
Merge pull request #1965 from rhc54/topic/pmixfix
...
Provide backward compatible keys so that the non-PMIx components in t…
2016-08-13 13:48:12 -07:00
Ralph Castain
be8424b691
Provide backward compatible keys so that the non-PMIx components in the opal/pmix framework don't have to adjust as we continue to work on finalizing the PMIx reference scheme. Activate and utilize the new PMIx show_help capability to provide more meaningful error output when the server cannot start.
...
Add a contrib script to clean up permissions incorrectly modified due to things like smb mounts
2016-08-13 12:13:04 -07:00
rhc54
d12e50b2d6
Merge pull request #1963 from rhc54/topic/pmixfix
...
Fix shared memory rendezvous
2016-08-13 09:59:14 -07:00
Ralph Castain
08a0644df5
Fix shared memory rendezvous
2016-08-13 08:14:50 -07:00
rhc54
ddde154d28
Merge pull request #1962 from rhc54/topic/notify
...
Ensure we properly convert pmix status to ORTE state before activatin…
2016-08-13 06:59:50 -07:00
Ralph Castain
48d35a9627
Ensure we properly convert pmix status to ORTE state before activating an error state upon notification. Clean up some conversion issues on notification info. Add a new orte_notify.c test program.
2016-08-12 21:14:29 -07:00
rhc54
9868093bef
Merge pull request #1961 from rhc54/topic/static
...
Setup the job list in the PMIx integration so that static ports can run
2016-08-12 15:17:31 -07:00
rhc54
9eed451916
Merge pull request #1960 from rhc54/topic/rsh
...
Restore the rsh template creation code
2016-08-12 13:38:43 -07:00
rhc54
8d67f753ca
Merge pull request #1959 from rhc54/topic/nodeid
...
The node index isn't normally passed with the packed node object, so …
2016-08-12 13:30:10 -07:00
Ralph Castain
4a4c9703a9
Setup the job list in the PMIx integration so that static ports can run
2016-08-12 13:27:10 -07:00
rhc54
1ef3c86d44
Merge pull request #1931 from hjelmn/ess_fix
...
ess/base: set up nidmap after pmix
2016-08-12 13:10:30 -07:00
Ralph Castain
5717b75b45
Restore the rsh template creation code
2016-08-12 12:43:40 -07:00
rhc54
ee1ee2086c
Merge pull request #1958 from rhc54/topic/path
...
Fix a bug where we were requiring that all paths in $PATH be absolute
2016-08-12 12:31:43 -07:00
Ralph Castain
d4327fd973
The node index isn't normally passed with the packed node object, so we need to set it on the remote end as the orted needs to pass it down to the procs. Refactor the registration code to better package proc-level info - we will separate out the node and app levels in a subsequent change.
2016-08-12 12:06:23 -07:00
Ralph Castain
0e58609327
Fix a bug where we were requiring that all paths in $PATH be absolute. Some users provide relative paths in their environment, and we should respect those.
2016-08-12 11:28:57 -07:00
rhc54
163999bce0
Merge pull request #1957 from rhc54/topic/rsh
...
If the ssh agent hasn't been given, then check for qrsh and friends
2016-08-12 11:18:28 -07:00
Ralph Castain
1c44543854
If the ssh agent hasn't been given, then check for qrsh and friends
2016-08-12 07:46:39 -07:00
rhc54
397faad46b
Merge pull request #1954 from rhc54/topic/covpmix
...
Silence Coverity warnings
2016-08-12 06:38:04 -07:00
Ralph Castain
1d44f0c0e2
Silence Coverity warnings
2016-08-11 21:22:01 -07:00
Nathan Hjelm
9444df1eb7
osc/pt2pt: make lock_all locking on-demand
...
The original lock_all algorithm in osc/pt2pt sent a lock message to
each peer in the communicator even if the peer is never the target of
an operation. Since this scales very poorly the implementation has
been replaced by one that locks the remote peer on first communication
after a call to MPI_Win_lock_all.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-08-11 15:33:07 -06:00
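The on-demand locking described in the entry above can be sketched as follows. This is a minimal illustration, not the osc/pt2pt code: the `sketch_win_t` type, its fields, and every `sketch_*` function are hypothetical names, and a counter stands in for the lock messages a real implementation would send.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical window state for illustration; not the osc/pt2pt structures. */
typedef struct {
    int   npeers;
    bool  lock_all_active;  /* true once lock_all has been called */
    bool *peer_locked;      /* peer_locked[i]: lock request already sent to peer i */
    int   lock_msgs_sent;   /* stands in for real network traffic */
} sketch_win_t;

int sketch_win_init(sketch_win_t *win, int npeers)
{
    win->npeers = npeers;
    win->lock_all_active = false;
    win->lock_msgs_sent = 0;
    win->peer_locked = calloc((size_t) npeers, sizeof(bool));
    return (NULL != win->peer_locked) ? 0 : -1;
}

/* lock_all only records that the epoch is open; no per-peer message is sent. */
void sketch_lock_all(sketch_win_t *win)
{
    win->lock_all_active = true;
}

/* Called before every operation: lock the target the first time it is used. */
void sketch_ensure_locked(sketch_win_t *win, int target)
{
    assert(target >= 0 && target < win->npeers);
    if (win->lock_all_active && !win->peer_locked[target]) {
        win->peer_locked[target] = true;
        win->lock_msgs_sent++;  /* a real implementation would send a lock request */
    }
}

void sketch_put(sketch_win_t *win, int target)
{
    sketch_ensure_locked(win, target);
    /* the data transfer itself would go here */
}

void sketch_win_free(sketch_win_t *win)
{
    free(win->peer_locked);
    win->peer_locked = NULL;
}
```

With 1000 peers and operations that touch only two distinct targets, this sketch records two lock requests instead of 1000, which is the scaling benefit the commit message describes.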
Nathan Hjelm
7589a25377
osc/pt2pt: do not repost receive from request callback
...
This commit fixes an issue that can occur if a target gets overwhelmed with
requests. This can cause osc/pt2pt to go into deep recursion with a stack
like req_complete_cb -> ompi_osc_pt2pt_callback -> start -> req_complete_cb
-> ... . At small scale this is fine as the recursion depth stays small but
at larger scale we can quickly exhaust the stack processing frag requests.
To fix the issue the request callback now simply puts the request on a
list and returns. The osc/pt2pt progress function then handles the
processing and reposting of the request.
As part of this change osc/pt2pt can now post multiple fragment receive
requests per window. This should help prevent a target from being overwhelmed.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2016-08-11 15:33:07 -06:00
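The callback-to-progress handoff described in the entry above can be sketched as a simple pending list. This is an assumption-laden illustration: `sketch_req_t`, `sketch_pending_t`, and both functions are hypothetical, the list is single-threaded, and real code would need a lock or an atomic list to be thread safe.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request and pending-list types; not the osc/pt2pt structures. */
typedef struct sketch_req {
    struct sketch_req *next;
    int id;
} sketch_req_t;

typedef struct {
    sketch_req_t *head;
    sketch_req_t *tail;
    int processed;          /* number of requests handled by progress */
} sketch_pending_t;

/* Completion callback: append to the list and return. It never processes
 * or reposts the request, so it cannot recurse into itself. */
void sketch_req_complete_cb(sketch_pending_t *q, sketch_req_t *req)
{
    assert(NULL != q && NULL != req);
    req->next = NULL;
    if (NULL != q->tail) {
        q->tail->next = req;
    } else {
        q->head = req;
    }
    q->tail = req;
}

/* Progress function: detach the whole list, then handle each request in a
 * flat loop. Returns the number of requests handled in this call. */
int sketch_progress(sketch_pending_t *q)
{
    int handled = 0;
    sketch_req_t *req = q->head;

    q->head = q->tail = NULL;
    while (NULL != req) {
        sketch_req_t *next = req->next;
        /* fragment processing and reposting of the receive would go here */
        q->processed++;
        handled++;
        req = next;
    }
    return handled;
}
```

Because the work happens in a loop inside the progress function, the stack depth stays constant no matter how many requests complete back to back.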
rhc54
82240f579a
Merge pull request #1952 from rhc54/topic/pmixcov
...
Update to latest PMIx toolext branch
2016-08-11 14:24:13 -07:00
Ralph Castain
73544d2e00
Rename symbol
2016-08-11 13:06:46 -07:00
Ralph Castain
b0cc9b0bc8
Update to latest PMIx toolext branch
...
Fix indentations
Update the ext20 component to match latest PMIx master.
Clean up name conflicts and uninitialized variables
2016-08-11 12:29:48 -07:00
George Bosilca
8d0baf140f
If the RTE fails to deliver the daemon information,
...
gracefully fall back to a non-reordered communicator.
Optimize the loops building the process hierarchy.
2016-08-11 13:04:27 -04:00
Howard Pritchard
e46eee3fcb
mtl/ofi: use mca param to set av type
...
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2016-08-10 16:10:17 -06:00
Ralph Castain
23886754f0
Trim the coverity build line to packages available on this machine
2016-08-10 13:55:55 -07:00
Ralph Castain
55551a4fb7
Complete debug of the nightly coverity submittal
2016-08-10 12:05:21 -07:00
Ralph Castain
375f04b277
Update the nightly builds to submit to coverity
2016-08-10 08:45:18 -07:00
Gilles Gouaillardet
dfbf2b7be4
opal/threads: add OPAL_THREAD_SUB_SIZE_T macro
...
-1 is not a valid size_t, so instead of OPAL_THREAD_ADD_SIZE_T(..., -1),
simply use OPAL_THREAD_SUB_SIZE_T(..., 1) and keep picky compilers happy
2016-08-10 13:37:36 +09:00
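The reasoning in the entry above can be illustrated with C11 atomics. Passing -1 where a `size_t` is expected converts it to `SIZE_MAX`; the addition still yields the right result through unsigned wraparound, but strict compilers warn about the implicit conversion. The helpers below are hypothetical stand-ins built on `<stdatomic.h>`, not the OPAL macros themselves.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical helpers built on C11 atomics; not the OPAL macros. */

/* Atomically add delta and return the new value. */
size_t sketch_add_size_t(atomic_size_t *addr, size_t delta)
{
    return atomic_fetch_add(addr, delta) + delta;
}

/* Atomically subtract delta and return the new value. The caller passes a
 * plain positive delta, so no negative literal is ever converted to size_t. */
size_t sketch_sub_size_t(atomic_size_t *addr, size_t delta)
{
    size_t old = atomic_fetch_sub(addr, delta);
    assert(old >= delta);  /* catch counter underflow in debug builds */
    return old - delta;
}
```

A decrement then reads `sketch_sub_size_t(&count, 1)` rather than `sketch_add_size_t(&count, -1)`, which keeps the intent explicit and avoids the sign-conversion warning.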
Nathan Hjelm
799104f688
Merge pull request #1947 from hjelmn/perf
...
pml/ob1: be more selective when using rdma capable btls
2016-08-09 22:15:09 -06:00
Nathan Hjelm
4079eec974
pml/ob1: be more selective when using rdma capable btls
...
This commit updates the btl selection logic for the RDMA and RDMA
pipeline protocols to use a btl iff: 1) the btl is also used for eager
messages (high exclusivity), or 2) no other RDMA btl is available on
an endpoint and the pml_ob1_use_all_rdma MCA variable is true. This
fixes a performance regression with shared memory when an RDMA capable
network is available.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-08-09 20:54:42 -06:00
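The selection rule in the entry above can be written as a small predicate. This is a literal reading of the commit message, not the pml/ob1 code: `sketch_btl_t`, its fields, and `sketch_use_for_rdma` are hypothetical, and "no other RDMA btl is available" is assumed to mean that no other btl on the endpoint is RDMA capable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-endpoint view of one btl; field names are illustrative. */
typedef struct {
    bool rdma_capable;    /* btl supports the RDMA protocols */
    bool used_for_eager;  /* btl is also used for eager messages */
} sketch_btl_t;

/* Decide whether btls[idx] should be used for RDMA on this endpoint.
 * use_all_rdma mirrors the pml_ob1_use_all_rdma MCA variable. */
bool sketch_use_for_rdma(const sketch_btl_t *btls, int nbtls, int idx,
                         bool use_all_rdma)
{
    assert(idx >= 0 && idx < nbtls);

    if (!btls[idx].rdma_capable) {
        return false;
    }
    /* Condition 1: the btl is also used for eager messages. */
    if (btls[idx].used_for_eager) {
        return true;
    }
    /* Condition 2: the MCA variable is true and no other RDMA btl exists. */
    if (!use_all_rdma) {
        return false;
    }
    for (int i = 0; i < nbtls; ++i) {
        if (i != idx && btls[i].rdma_capable) {
            return false;
        }
    }
    return true;
}
```

Under this reading, an endpoint with a shared memory btl used for eager messages and a separate RDMA network btl selects only the shared memory btl, which matches the regression the commit says it fixes.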
rhc54
60f789dca1
Merge pull request #1948 from rhc54/topic/pmixtool
...
Update to include extended tool support, new datatypes
2016-08-09 16:17:28 -07:00
Nathan Hjelm
19be439998
Merge pull request #1949 from hjelmn/ugni_fix
...
btl/ugni: fix another connection race
2016-08-09 08:32:40 -06:00
Nathan Hjelm
38f18eed22
Merge pull request #1941 from ggouaillardet/topic/memory_patcher_configury
...
configury: make memory/patcher symbol detection more robust
2016-08-09 07:06:38 -06:00
Gilles Gouaillardet
13009aa290
opal/alfg: have opal_random() wrapper always return a positive int
2016-08-09 17:12:30 +09:00
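One common way to guarantee a non-negative `int` from a 32-bit random word is to mask off the sign bit. The sketch below shows that technique under the assumption that the generator yields a `uint32_t`; `sketch_positive_int` is a hypothetical name, and the commit message does not say how the wrapper was actually changed.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* The mask below only clears the sign bit if int has at least 32 bits. */
static_assert(INT_MAX >= 0x7fffffff, "sketch assumes int is at least 32 bits");

/* Map a raw 32-bit random word to an int in the range [0, INT_MAX].
 * Clearing the top bit means the conversion to int can never go negative. */
int sketch_positive_int(uint32_t raw)
{
    return (int) (raw & (uint32_t) INT_MAX);
}
```

Note that zero remains a possible result, so "positive" here means non-negative.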
Gilles Gouaillardet
50966673a9
configury: fix sed expression in libtool's patch for NAG compiler
2016-08-09 11:02:46 +09:00
Gilles Gouaillardet
6f6b3ac68a
configury: standardize memory/patcher symbol detection and make it more robust
...
By default, Sun compilers optimize out the original test and hence fail to detect that a symbol is missing.
2016-08-09 09:35:52 +09:00
Nathan Hjelm
adb668209b
btl/ugni: fix another connection race
...
This commit fixes a race that can occur when two threads are in the
ugni progress function at the same time. This race occurs when one
thread calls GNI_PostDataProbeById then goes to sleep then another
thread calls GNI_PostDataProbeById then GNI_EpPostDataWaitById before
the other thread wakes up. If this happens the first thread will print
a warning on GNI_EpPostDataWaitById about no matching post.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-08-08 15:38:11 -06:00
Ralph Castain
527b5c692a
Update to include extended tool support, new datatypes
2016-08-08 13:39:46 -07:00
Ralph Castain
ba77d9beff
Remove forced debugs
2016-08-08 13:20:24 -07:00
Nathan Hjelm
2788083b98
Merge pull request #1936 from hjelmn/osc_pt2pt_fix
...
osc/pt2pt: do not set rdma_frag after start
2016-08-08 14:17:40 -06:00
Nathan Hjelm
e4d7ea75a9
Merge pull request #1935 from hjelmn/persistent_fix
...
pml/ob1: reset req_bytes_packed on start
2016-08-08 14:17:13 -06:00
Todd Kordenbrock
b90da992c8
Merge pull request #1895 from PDeveze/Patchs-on-btl-portals4
...
btl/portals4: Take into account the limitation of portals4 (max_msg_s…
2016-08-08 15:12:50 -05:00
Todd Kordenbrock
3be6052523
Merge pull request #1896 from PDeveze/Patchs-on-coll-portals4
...
Patches on coll portals4
2016-08-08 14:57:02 -05:00
Nathan Hjelm
5ced037488
Merge pull request #1939 from hjelmn/ugni_fix
...
btl/ugni: protect against re-entry and races in connections
2016-08-08 08:55:30 -06:00
Edgar Gabriel
fb9fa4fbc4
Merge pull request #1938 from edgargabriel/pr/barrier-on-close
...
io/ompio: Add barrier to file_close and to file_set_size
2016-08-08 09:22:08 -05:00
Edgar Gabriel
4709f4229b
Merge pull request #1929 from edgargabriel/pr/ompio-code-reorg
...
io/ompio: next step in code-reorganization
2016-08-08 09:20:54 -05:00
Artem Polyakov
31cf46a827
Merge pull request #1945 from artpol84/pmi_fixes
...
pmix-related fixes
2016-08-08 21:19:18 +07:00
Artem Polyakov
b24ec3e3b9
pmix/s2: fix indentation (only)
2016-08-06 16:31:19 +06:00