Commit graph

26922 commits

Author SHA1 Message Date
Ralph Castain
fe64144892 Merge pull request #3259 from rhc54/topic/launch
Update how we pass the node regex so we pass _all_ nodes, even those without daemons.
2017-04-03 20:40:48 -07:00
Ralph Castain
92c996487c Update how we pass the node regex so we pass _all_ nodes, even those without daemons. This allows the backend daemons to form a complete picture of the allocation. Include info on which nodes have daemons on them, and populate that info on the backend as well.
Set the daemons' state to "running" and mark them as "alive" by default when constructing the nidmap

Get the DVM running again

Fix direct modex by eliminating race condition caused by releasing data while sending it

Up the size limit before compressing

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-04-03 19:25:15 -07:00
Nathan Hjelm
fad0803920 osc/rdma: fix typo in atomic code
Fixes #3267

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-04-03 15:54:28 -06:00
Ralph Castain
9850832dbd Merge pull request #3273 from rhc54/topic/pmix2.0
Update to PMIx v2.0alpha
2017-04-03 11:07:08 -07:00
Aurélien Bouteiller
6ef6a3fb18 Fix the Fortran mpiext building system
Signed-off-by: Aurélien Bouteiller <bouteill@icl.utk.edu>
2017-04-03 13:46:32 -04:00
Ralph Castain
2cc5fea8be Update to PMIx v2.0alpha
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-04-03 10:02:29 -07:00
Nathan Hjelm
533a8e6dae cma: restore --with-cma=no configure option
This support broke when we enabled CMA by default. Addresses the issue
raised by #3270.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-04-03 10:42:34 -06:00
Yossi
b3736701c4 Merge pull request #3239 from xinzhao3/topic/ucx-num-eps
Passing estimated_num_procs to UCX init in PML and SPML.
2017-04-03 18:31:27 +03:00
Ralph Castain
5f6ba81f11 Merge pull request #3263 from ggouaillardet/topic/hwloc1116
hwloc: update hwloc to 1.11.6
2017-03-31 08:03:04 -07:00
Gilles Gouaillardet
81062b7cd2 hwloc: update hwloc to 1.11.6
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-03-31 13:35:16 +09:00
Jeff Squyres
7e57075f0d Merge pull request #3248 from jsquyres/pr/remove-macosx-pkg-support
dist: remove OS X package script
2017-03-29 18:46:14 -04:00
Jeff Squyres
f0a8a0af51 dist: remove OS X package script
We stopped supporting this long ago.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-03-29 10:13:01 -04:00
Jeff Squyres
f980cba6b3 Merge pull request #3249 from rhc54/topic/abort
Use the correct callback data - the callback function was expecting a…
2017-03-28 20:54:39 -04:00
Jeff Squyres
c7c13253ea Merge pull request #3250 from jsquyres/pr/modulefile-in-srpm
openmpi.spec: also put the modulefile in /opt if install_in_opt==1
2017-03-28 20:46:26 -04:00
Kevin Buckley
9e23c5e3f6 openmpi.spec: also put the modulefile in /opt if install_in_opt==1
Thanks to Kevin Buckley for noticing the issue and supplying the
patch.

[skip ci]
bot:notest

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-03-28 20:45:09 -04:00
Ralph Castain
7dd34d0c9a Use the correct callback data - the callback function was expecting a bool*, not a pmix_ptl_sr_t*.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-28 17:21:47 -07:00
Xin Zhao
ee952fcccd Passing estimated_num_procs to UCX init in PML and SPML.
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2017-03-27 20:36:52 +03:00
Ralph Castain
d782542c5c Merge pull request #3241 from rhc54/topic/cov
Silence coverity dead-code warning
2017-03-27 08:33:33 -07:00
Ralph Castain
71c9bc1f7e Merge pull request #3242 from jsquyres/pr/orterun-run-as-root-minor-tweaks
orte: minor tweaks to run-as-root message
2017-03-27 08:29:43 -07:00
Jeff Squyres
a333cf691a orte: minor tweaks to run-as-root message
Two updates:

1. Remove the "run as root" error message from orterun.c, because that
   functionality is now in orted_submit.c (although it is still
   required in orte-dvm.c -- so sync the message in orted_submit.c and
   orte-dvm.c to be identical).
2. Slightly tweak the text of the "run as root" error message to
   explicitly state that we (strongly) suggest running as a non-root
   user (and add a little whitespace).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-03-27 04:50:21 -07:00
Ralph Castain
583dbe954c Silence coverity dead-code warning
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-26 20:36:43 -07:00
Ralph Castain
b398d721d5 Merge pull request #3236 from rhc54/topic/craycleanups
Silence a flood of warnings when compiling with gcc on Cray
2017-03-24 13:33:46 -07:00
Ralph Castain
6ef079cdab Merge pull request #3235 from rhc54/topic/tcp
Stop segfault in BTL/TCP
2017-03-24 12:38:31 -07:00
Ralph Castain
ecc8000136 Silence a flood of warnings when compiling with gcc on Cray
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-24 13:37:11 -06:00
Ralph Castain
470452cba0 Correctly check the sa_family and cast the data correctly before passing it to inet_ntop, and don't be quite as fancy with the pointer arithmetic, as the combination was causing us to segfault every time this debug message was called.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-24 11:42:57 -07:00
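The pattern described in commit 470452cba0 can be illustrated with a minimal sketch. It assumes the function the message refers to is the POSIX `inet_ntop`. The helper name `addr_to_string` is hypothetical and is not taken from the Open MPI source; it only shows the general idea of checking `sa_family` first and then casting to the matching structure instead of computing the address offset by hand.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper (not from the Open MPI tree): format the address
 * stored in `sa` into `buf`. Returns `buf` on success, or NULL if the
 * address family is unsupported or the conversion fails. */
const char *addr_to_string(const struct sockaddr *sa, char *buf, socklen_t len)
{
    if (sa == NULL || buf == NULL) {
        return NULL;
    }
    switch (sa->sa_family) {
    case AF_INET: {
        /* Cast only after confirming the family, then pass the address field. */
        const struct sockaddr_in *in4 = (const struct sockaddr_in *)sa;
        return inet_ntop(AF_INET, &in4->sin_addr, buf, len);
    }
    case AF_INET6: {
        const struct sockaddr_in6 *in6 = (const struct sockaddr_in6 *)sa;
        return inet_ntop(AF_INET6, &in6->sin6_addr, buf, len);
    }
    default:
        /* Unknown family: refuse rather than guess at the layout. */
        return NULL;
    }
}
```

A caller would pass a buffer of at least `INET6_ADDRSTRLEN` bytes so that either family fits.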
Jeff Squyres
88a4a163ae Merge pull request #3229 from hjelmn/osc_pt2pt
osc/pt2pt: fix typo
2017-03-24 14:38:16 -04:00
Ralph Castain
0694d0bfbe Merge pull request #3234 from rhc54/topic/cov
Fix coverity issues
2017-03-24 08:56:45 -07:00
Ralph Castain
35f817911e Fix coverity issues
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-24 08:09:46 -07:00
Ralph Castain
c0bcd11bcf Fix permissions - no CI required
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-23 08:05:52 -07:00
Nathan Hjelm
c72fb30eb5 osc/pt2pt: fix typo
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2017-03-23 09:00:21 -06:00
Ralph Castain
b6d1e24ec9 Merge pull request #3227 from rhc54/topic/abort
If we lose connection to the server after initiating a send/recv in P…
2017-03-23 07:55:21 -07:00
Ralph Castain
55e4fba5f5 If we lose connection to the server after initiating a send/recv in PMIx (e.g., in PMIx_Abort), then we need to "resolve" all pending recvs to avoid hanging.
Fixes #3225

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-23 02:53:21 -07:00
Ralph Castain
ea84a53faa Merge pull request #3218 from rhc54/topic/pmix2
Update to include the PMIx 2.0 APIs for monitoring and job control.
2017-03-21 20:11:10 -07:00
Ralph Castain
d645557fa0 Update to include the PMIx 2.0 APIs for monitoring and job control. Include required integration, but leave the monitors off for now. Move the sensor framework out of ORTE as it is being absorbed into PMIx
Fix typo and silence warnings

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-21 17:47:08 -07:00
Ralph Castain
10d401b6ec Merge pull request #3217 from rhc54/topic/wdirs
Resolve a race condition for setting our working directory when fork/exec'ing application procs.
2017-03-21 17:39:54 -07:00
Ralph Castain
74fd2c30af Cleanup alps odls module
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-21 17:41:11 -06:00
Ralph Castain
09a7b0ffad Merge pull request #3219 from rhc54/topic/getout
Ensure we properly exit with error if we cannot map the job
2017-03-21 16:29:56 -07:00
Ralph Castain
f8e1e3bed3 Ensure we properly exit with error if we cannot map the job
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-21 15:15:32 -07:00
Ralph Castain
75684dc260 Resolve a race condition for setting our working directory when fork/exec'ing application procs. We have to ensure we do it after the fork occurs since we want to use multiple threads in the odls. Otherwise, the different threads are bouncing the entire process around.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-21 13:54:03 -07:00
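The race described in commit 75684dc260 arises because the working directory is a per-process attribute: if several threads each call `chdir` before forking, they move the whole process around underneath one another. A minimal sketch of the general remedy, changing directory in the child after `fork`, is shown here. The helper name `spawn_in_dir` and the exit codes 126 and 127 are assumptions made for illustration, not the actual odls code.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the Open MPI tree): run the program at
 * argv[0] with `wdir` as its working directory. Returns the child's exit
 * status, or -1 if fork/waitpid fails or the child ends abnormally. */
int spawn_in_dir(const char *wdir, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Child: chdir here affects only this new process, so other
         * threads in the parent never see their directory change. */
        if (chdir(wdir) != 0) {
            _exit(126);   /* assumed code: could not enter the directory */
        }
        execv(argv[0], argv);
        _exit(127);       /* assumed code: exec itself failed */
    }
    /* Parent: its working directory is untouched. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child uses only `chdir`, `execv`, and `_exit` between `fork` and exec, which keeps it to async-signal-safe calls, as is advisable when the parent is multithreaded.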
Jeff Squyres
20bf0dd7c6 Merge pull request #3214 from hppritcha/travis_remove_osx
travis: remove os-x from OS test matrix
2017-03-21 13:48:28 -04:00
Howard Pritchard
f06a5d93ea travis: remove os-x from OS test matrix
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-03-21 10:36:30 -06:00
Joshua Ladd
f8d7324b4f Merge pull request #3168 from xinzhao3/topic/ucx-mt-support
Add multithreading support in PML UCX framework.
2017-03-20 15:50:50 -04:00
Xin Zhao
6a99c60fbd Add multithreading support in PML UCX framework.
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2017-03-20 19:55:00 +02:00
Ralph Castain
40a0144e76 Merge pull request #3208 from rhc54/topic/s2
You cannot include both pmi.h and pmi2.h as they have conflicting defines in them.
2017-03-19 14:21:04 -07:00
Ralph Castain
4b6d220a83 You cannot include both pmi.h and pmi2.h as they have conflicting defines in them.
Thanks to Kilian Cavalotti for pointing it out

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-19 11:53:54 -07:00
Jeff Squyres
ce0e1cd32c Merge pull request #3201 from hppritcha/jjhursey-topic/timer-gettimeofday
Jjhursey topic/timer gettimeofday
2017-03-18 20:12:36 -04:00
Howard Pritchard
b9331527f5 timer: hack use of clock_gettime
better solution needed later
 workaround for #3003

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-03-18 15:08:59 -05:00
Jeff Squyres
19b2454d9d Merge pull request #3198 from jsquyres/pr/master/fix-hwloc-dist-autogen
hwloc: re-enable use of autogen.pl in a tarball
2017-03-17 14:44:54 -04:00
Jeff Squyres
b8dfd49e97 hwloc: re-enable use of autogen.pl in a tarball
Commit fec519a793 broke the ability to
run autogen.pl in a distribution tarball.  This commit restores that
ability by also distributing opal/mca/hwloc/autogen.options in the
tarball.

Skipping CI because CI does not test this functionality:

[skip ci]
bot:notest

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-03-17 11:41:17 -07:00
Pavel Shamis / Pasha
78cd5de5d6 Merge pull request #3191 from shamisp/topic/oshmem_wait_cleanup
OSHMEM: shmem_wait code cleanup
2017-03-17 13:35:10 -05:00