Ralph Castain
b6d1e24ec9
Merge pull request #3227 from rhc54/topic/abort
...
If we lose connection to the server after initiating a send/recv in P…
2017-03-23 07:55:21 -07:00
Ralph Castain
55e4fba5f5
If we lose connection to the server after initiating a send/recv in PMIx (e.g., in PMIx_Abort), then we need to "resolve" all pending recvs to avoid hanging.
...
Fixes #3225
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-23 02:53:21 -07:00
Ralph Castain
ea84a53faa
Merge pull request #3218 from rhc54/topic/pmix2
...
Update to include the PMIx 2.0 APIs for monitoring and job control.
2017-03-21 20:11:10 -07:00
Ralph Castain
d645557fa0
Update to include the PMIx 2.0 APIs for monitoring and job control. Include required integration, but leave the monitors off for now. Move the sensor framework out of ORTE as it is being absorbed into PMIx
...
Fix typo and silence warnings
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-21 17:47:08 -07:00
Ralph Castain
10d401b6ec
Merge pull request #3217 from rhc54/topic/wdirs
...
Resolve a race condition for setting our working directory when fork/exec'ing application procs.
2017-03-21 17:39:54 -07:00
Ralph Castain
74fd2c30af
Cleanup alps odls module
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-21 17:41:11 -06:00
Ralph Castain
09a7b0ffad
Merge pull request #3219 from rhc54/topic/getout
...
Ensure we properly exit with error if we cannot map the job
2017-03-21 16:29:56 -07:00
Ralph Castain
f8e1e3bed3
Ensure we properly exit with error if we cannot map the job
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-21 15:15:32 -07:00
Ralph Castain
75684dc260
Resolve a race condition for setting our working directory when fork/exec'ing application procs. We have to ensure we do it after the fork occurs since we want to use multiple threads in the odls. Otherwise, the different threads each change the working directory of the entire process out from under one another.
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-21 13:54:03 -07:00
Jeff Squyres
20bf0dd7c6
Merge pull request #3214 from hppritcha/travis_remove_osx
...
travis: remove os-x from OS test matrix
2017-03-21 13:48:28 -04:00
Howard Pritchard
f06a5d93ea
travis: remove os-x from OS test matrix
...
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-03-21 10:36:30 -06:00
Joshua Ladd
f8d7324b4f
Merge pull request #3168 from xinzhao3/topic/ucx-mt-support
...
Add multithreading support in PML UCX framework.
2017-03-20 15:50:50 -04:00
Xin Zhao
6a99c60fbd
Add multithreading support in PML UCX framework.
...
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2017-03-20 19:55:00 +02:00
Ralph Castain
40a0144e76
Merge pull request #3208 from rhc54/topic/s2
...
You cannot include both pmi.h and pmi2.h as they have conflicting defines in them.
2017-03-19 14:21:04 -07:00
Ralph Castain
4b6d220a83
You cannot include both pmi.h and pmi2.h as they have conflicting defines in them.
...
Thanks to Kilian Cavalotti for pointing it out
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-19 11:53:54 -07:00
Jeff Squyres
ce0e1cd32c
Merge pull request #3201 from hppritcha/jjhursey-topic/timer-gettimeofday
...
Jjhursey topic/timer gettimeofday
2017-03-18 20:12:36 -04:00
Howard Pritchard
b9331527f5
timer: hack use of clock_gettime
...
better solution needed later
workaround for #3003
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-03-18 15:08:59 -05:00
Jeff Squyres
19b2454d9d
Merge pull request #3198 from jsquyres/pr/master/fix-hwloc-dist-autogen
...
hwloc: re-enable use of autogen.pl in a tarball
2017-03-17 14:44:54 -04:00
Jeff Squyres
b8dfd49e97
hwloc: re-enable use of autogen.pl in a tarball
...
Commit fec519a793 broke the ability to
run autogen.pl in a distribution tarball. This commit restores that
ability by also distributing opal/mca/hwloc/autogen.options in the
tarball.
Skipping CI because CI does not test this functionality:
[skip ci]
bot:notest
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-03-17 11:41:17 -07:00
Pavel Shamis / Pasha
78cd5de5d6
Merge pull request #3191 from shamisp/topic/oshmem_wait_cleanup
...
OSHMEM: shmem_wait code cleanup
2017-03-17 13:35:10 -05:00
Ralph Castain
afcc33862e
Merge pull request #3197 from rhc54/topic/errors
...
Provide a little more help on the error messages when an executable i…
2017-03-17 11:29:39 -07:00
Ralph Castain
dc85e7fde7
Provide a little more help on the error messages when an executable isn't found so we have some better idea where we were looking for it. Don't double-report such errors. Ensure the ORTE_ERROR_NAME doesn't get a NULL back for the string name of an error code as that might cause some systems to segfault
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-17 09:54:37 -07:00
Ralph Castain
45b46dc446
Merge pull request #3181 from artpol84/add_proc_fix_2/master
...
ompi: Avoid unnecessary PMIx lookups when adding procs (step 2).
2017-03-16 15:06:08 -07:00
Jeff Squyres
5219054d29
Merge pull request #3185 from jsquyres/pr/master/compiler-warning-squashes
...
Compiler warning squashes
2017-03-16 10:12:08 -04:00
Howard Pritchard
65d0372e84
Merge pull request #3179 from hppritcha/topic/remove_osx_builtin_atomics
...
OSx: remove built-in atomics support
2017-03-16 07:58:41 -06:00
Jeff Squyres
760db0d5ce
osc/pt2pt: fix compiler warning
...
Remove unused variable.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-03-16 05:46:11 -07:00
Jeff Squyres
1947280865
topo/treematch: squash some compiler warnings
...
Only define MIN/MAX if they are not already defined.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-03-16 05:44:26 -07:00
Jeff Squyres
b51c4e2797
memory/patcher: fix a compiler warning
...
Don't define the madvise intercept functions since we're not currently
intercepting madvise.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-03-16 05:43:51 -07:00
Joshua Hursey
48d13aa8ef
mpi/c: Force wtick/wtime to use gettimeofday
...
* See https://github.com/open-mpi/ompi/issues/3003 for a discussion about
this patch. Once we get a better version in place we can revert this
change.
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-03-15 21:24:37 -05:00
Jeff Squyres
616f20c52c
timer/linux: rename component-specific functions
...
Several component-specific functions were named with a prefix of
"opal_timer_base", which was quite confusing. Rename them to have a
prefix "opal_timer_linux" to make it clear that they are here in this
component (and different than *actual* opal_timer_base symbols).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-03-15 21:03:13 -05:00
Jeff Squyres
290d4598df
timer/linux: remove global variable
...
This variable is only used in one file, so make it static.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-03-15 21:03:06 -05:00
Artem Polyakov
1f7a3a2d54
ompi: Avoid unnecessary PMIx lookups when adding procs (step 2).
...
Follow-up for 717f3fef62.
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2017-03-16 07:47:27 +07:00
Pavel Shamis (Pasha)
95c440683b
OSHMEM: shmem_wait code cleanup
...
* update the naming convention for the arguments so that each
name matches the actual meaning of the argument
* remove local variable references in the macro
* add volatile to the poll variables
Signed-off-by: Pavel Shamis (Pasha) <pasharesearch@gmail.com>
2017-03-15 21:53:44 +00:00
Howard Pritchard
db2e1298fb
OSx: remove built-in atomics support
...
It was decided to remove support for os-x builtin atomics
Fixes #2668
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-03-15 12:45:33 -06:00
Jeff Squyres
60ca372d60
NEWS: Sync with v2.0.x and v1.10 releases
...
Pull in content from v1.10 and v2.0.x branches.
[skip ci]
bot:notest
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-03-15 09:56:31 -07:00
Howard Pritchard
1709febdea
Merge pull request #3166 from hppritcha/topic/swat_state_orted_comp_warning
...
ORTED: swat another compiler warning
2017-03-15 08:40:59 -06:00
Howard Pritchard
1f7378d7e4
Merge pull request #3151 from hppritcha/topic/update_license_file
...
LICENSE: update according to copyrights in source files
2017-03-15 07:58:27 -06:00
Howard Pritchard
8e4689c2b8
v3.x:updates for branching v3.x
...
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-03-14 14:03:47 -06:00
Howard Pritchard
c2da14d514
AUTHORS: update for 3.0.0 branching
...
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-03-14 13:54:56 -06:00
Ralph Castain
96d7d10c1d
Merge pull request #3170 from rhc54/topic/reg
...
Ensure the backend daemons know if we are in a managed allocation and if the HNP was included in the allocation
2017-03-14 12:48:09 -07:00
Nathan Hjelm
db9232f8d6
Merge pull request #3169 from hjelmn/btl_ugni_2_0
...
More btl/ugni updates
2017-03-14 13:23:13 -06:00
Nathan Hjelm
37214eda09
Merge pull request #3164 from hjelmn/ob1_pinned
...
pml/ob1: do not cache leave_pinned
2017-03-14 13:22:18 -06:00
Mike Dubman
ccac7e5363
Merge pull request #3157 from vspetrov/c_coll_allgather_usage_bugfix
...
Fixes the coll_allgather usage bug
2017-03-14 18:42:04 +01:00
Ralph Castain
24e8639826
Platform file update
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-14 11:11:48 -06:00
Ralph Castain
955fa0456d
Merge pull request #3161 from rhc54/topic/cov2
...
Silence Coverity warnings
2017-03-14 10:10:11 -07:00
Ralph Castain
61a71e25ef
Ensure the backend daemons know if we are in a managed allocation and if
...
the HNP was included in the allocation
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-14 10:06:43 -07:00
Nathan Hjelm
6b210fa2c4
btl/ugni: do not return a frag from sendi if an endpoint is waitlisted
...
This fixes a hang that can occur when running bandwidth tests.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-03-14 10:14:13 -06:00
Nathan Hjelm
2e42b0afbd
btl/ugni: move connection check into sync event
...
This commit makes datagram checks time based and reduces their
frequency when only the wildcard datagram is posted. This change
improves latency on knl systems.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-03-14 10:10:05 -06:00
Nathan Hjelm
3e7ef48c13
pml/ob1: do not cache leave_pinned
...
This commit fixes a bug that disabled both the RDMA pipeline and RDMA
protocols in ob1. ob1 was internally caching the values of
opal_leave_pinned and opal_leave_pinned_pipeline at init time. This is
no longer valid as opal_leave_pinned may be set by any call to a btl's
add_procs.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-03-14 09:00:40 -06:00
Howard Pritchard
5daaf7f3fd
ORTED: swat another compiler warning
...
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-03-14 08:41:51 -06:00