Ralph Castain
05f98d66a6
Merge pull request #4396 from rhc54/topic/pmixconfig
...
Alter the PMIx embedded configuration
2017-10-25 10:10:50 -05:00
Gilles Gouaillardet
c4650b5904
Merge pull request #4383 from ggouaillardet/topic/configury_ucx
...
configury: revamp ucx detection
2017-10-25 15:33:38 +09:00
Ralph Castain
8fbfe68754
Alter the PMIx embedded configuration so that we can build static with devel headers - if the builder requests that we install a separate libpmix, then don't prefix the PMIx variables.
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-24 21:45:27 -07:00
Ralph Castain
cf3bc4f55b
Merge pull request #4346 from matcabral/psm2_mtl_mq_thread_fix
...
MTL PSM2: add a thread lock while peeking and completing the psm2 requests.
2017-10-24 16:41:29 -05:00
Ralph Castain
14a0701949
Merge pull request #4391 from rhc54/topic/scale
...
Add timeout option to scaling script
2017-10-24 14:34:48 -05:00
Ralph Castain
e7c6718d29
Add timeout option to scaling script
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-24 12:33:22 -07:00
Ralph Castain
987aac1268
Merge pull request #4387 from rhc54/topic/dmodx
...
We should never block when requesting dmodex data from the PMIx server
2017-10-24 10:46:04 -05:00
Ralph Castain
292983261a
We should never block when requesting dmodex data from the PMIx server, as this will block it from being able to accept connections from local clients. Do not deregister standing dmodex requests when a fence completes unless we actually collected the data in the fence
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-24 07:51:10 -07:00
Ralph Castain
70c455938b
Merge pull request #4382 from rhc54/topic/scaling
...
Update the scaling script to avoid use of "system" command
2017-10-23 22:43:25 -05:00
Ralph Castain
0353be9704
Update MPI init to properly skip barriers
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-23 19:28:34 -07:00
Gilles Gouaillardet
af03f55aa8
configury: revamp ucx detection
...
- when --with-ucx=DIR is not set, try the default path and fallback to /opt/ucx
- when --with-ucx-libdir is not set, try lib64 and then lib directories
- do not handle --with-ucx-libdir (this is a user mistake, no need to over-complicate our logic)
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-10-24 09:27:57 +09:00
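The "try lib64 and then lib" search order from the commit above can be sketched in C as follows. This is only an illustration of the first-match idea; the real detection is done by the configure scripts, and the helper names `dir_exists` and `pick_libdir` are hypothetical.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Returns 1 if 'path' names an existing directory, 0 otherwise. */
static int dir_exists(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/*
 * Hypothetical helper: given an installation prefix, pick the first
 * existing library directory, preferring lib64 over lib.
 * On success writes the path into 'out' and returns 0.
 * On failure writes an empty string and returns -1.
 */
static int pick_libdir(const char *prefix, char *out, size_t outlen)
{
    static const char *const subdirs[] = { "lib64", "lib" };

    for (size_t i = 0; i < sizeof(subdirs) / sizeof(subdirs[0]); i++) {
        int n = snprintf(out, outlen, "%s/%s", prefix, subdirs[i]);
        /* Skip candidates that were truncated or do not exist. */
        if (n > 0 && (size_t)n < outlen && dir_exists(out)) {
            return 0;
        }
    }
    if (outlen > 0) {
        out[0] = '\0';
    }
    return -1;
}
```

The same first-match loop could be applied one level up, to a list of candidate prefixes ending in /opt/ucx.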
Ralph Castain
3b71be4db4
Update the scaling script to avoid use of "system" command, thus ensuring that each command sees the same environment. Fix prun to pick up and propagate OMPI MCA params
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-23 16:27:41 -07:00
bosilca
ac348da13a
Merge pull request #4374 from bosilca/topic/osx_syslog
...
Topic/osx syslog
2017-10-23 18:06:36 -04:00
Ralph Castain
0721d933fc
Merge pull request #4376 from rhc54/topic/interlib
...
Update the interlib example to show an alternative method for model declaration
2017-10-23 14:53:13 -05:00
Ralph Castain
e33f319380
Update example to show tests of various APIs
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-23 12:02:54 -07:00
Ralph Castain
6ea3c8a0bd
Update the interlib example to show an alternative method for model declaration. Add a missing range value to the OPAL layer. Make it easier to see OMPI model callbacks
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-23 11:27:42 -07:00
George Bosilca
8f32b345de
Address syslog issues on OSX 10.13 with gcc 7.x
...
gcc 7.[1,2] (at least) fails to correctly parse the OSX 10.13 sys/syslog.h
header. As a result we need to protect syslog support in OPAL, PMIX and
ORTE.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-10-23 14:02:10 -04:00
Ralph Castain
79aef9369e
Merge pull request #4371 from rhc54/topic/xvr
...
Updates to support cross-version operations with OMPI v2.x
2017-10-22 11:41:18 -05:00
Ralph Castain
a63904d47f
Updates to support cross-version operations with OMPI v2.x
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-22 08:38:33 -07:00
Matias Cabral
b81bcd4b0d
MTL PSM2: add a thread lock while peeking and completing the psm2
...
requests.
Reviewed-by: Gopalakrishnan, Aravind <aravind.gopalakrishnan@intel.com>
Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
2017-10-20 14:46:48 -07:00
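The idea of holding a lock across both the peek and the completion can be sketched as below. This is a generic pthread illustration, not the PSM2 MTL code: `example_mq_t`, `peek_and_complete`, and `drain_with_two_threads` are hypothetical names, and the "queue" is reduced to two counters.

```c
#include <pthread.h>

/* Hypothetical stand-in for a matched queue shared by several threads. */
typedef struct {
    pthread_mutex_t lock;
    int ready;      /* requests that are ready to be completed */
    int completed;  /* requests that have been completed */
} example_mq_t;

/*
 * Peek and complete as one critical section. Without the lock, two
 * threads could both peek the same ready request and complete it twice.
 * Returns 1 if a request was completed, 0 if nothing was ready.
 */
static int peek_and_complete(example_mq_t *mq)
{
    int did_work = 0;

    pthread_mutex_lock(&mq->lock);
    if (mq->ready > 0) {   /* peek */
        mq->ready--;       /* complete */
        mq->completed++;
        did_work = 1;
    }
    pthread_mutex_unlock(&mq->lock);
    return did_work;
}

/* Thread body: keep progressing until nothing is ready. */
static void *progress_thread(void *arg)
{
    example_mq_t *mq = arg;
    while (peek_and_complete(mq)) {
        /* keep draining */
    }
    return NULL;
}

/* Drain 'nrequests' with two threads; returns the completed count or -1. */
static int drain_with_two_threads(int nrequests)
{
    example_mq_t mq = { .ready = nrequests, .completed = 0 };
    pthread_t t[2];
    int started = 0;

    if (pthread_mutex_init(&mq.lock, NULL) != 0) {
        return -1;
    }
    for (; started < 2; started++) {
        if (pthread_create(&t[started], NULL, progress_thread, &mq) != 0) {
            break;
        }
    }
    for (int i = 0; i < started; i++) {
        pthread_join(t[i], NULL);
    }
    pthread_mutex_destroy(&mq.lock);
    return started > 0 ? mq.completed : -1;
}
```

With the lock in place, the completed count should equal the number of requests no matter how the two threads interleave.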
Edgar Gabriel
defe73984a
Merge pull request #4362 from edgargabriel/topic/fbtl-locking-support
...
Add file locking support in posix fbtl
2017-10-19 23:21:17 -05:00
Ralph Castain
f374ba161c
Merge pull request #4366 from rhc54/topic/notify
...
Fix event registration so OpenMP/MPI coordination sides can both get notification of model declarations
2017-10-19 21:05:57 -05:00
Ralph Castain
f8ce31f13c
Fix event registration so OpenMP/MPI coordination sides can both get notification of model declarations
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-19 18:06:38 -07:00
Ralph Castain
4eee95358b
Merge pull request #4365 from rhc54/topic/routed
...
Ensure we update the routing plan so that tree spawn works!
2017-10-19 17:27:47 -05:00
Ralph Castain
75d411f3ea
Ensure we update the routing plan so that tree spawn works!
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-19 14:02:06 -07:00
Edgar Gabriel
be0de21e6f
fs/ufs and fbtl/posix: cleanup lock management
...
This commit looks large, but it's really mostly a cleanup step.
1. introduce proper error handling for the return values of fcntl and the fbtl_posix_lock function
2. rename a parameter to more accurately reflect what it does
3. introduce an MCA parameter in the fs/ufs component that lets the user control
the level of locking to enforce
4. move the initialization of the fs_block_size parameter from fs/ufs into the
common/ompio component. An fs component might be allowed to overwrite this
value, but none of the actual fs components do that.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-19 14:56:28 -05:00
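Item 1 in the commit above concerns checking the return value of fcntl when locking. A minimal sketch of that pattern with POSIX record locks is shown below. It is not the fbtl/posix implementation: `lock_range` and `locked_pwrite` are hypothetical helpers, and retrying on EINTR is an assumption made for the sketch.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: apply 'op' (F_SETLK or F_SETLKW) with lock type
 * 'type' (F_RDLCK, F_WRLCK or F_UNLCK) to the byte range
 * [offset, offset + len). Note that len == 0 means "to end of file".
 * Returns 0 on success, -1 on failure with errno preserved.
 */
static int lock_range(int fd, int op, short type, off_t offset, off_t len)
{
    struct flock fl;
    int rc;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = offset;
    fl.l_len = len;

    /* Retry if a signal interrupts a blocking lock request. */
    do {
        rc = fcntl(fd, op, &fl);
    } while (rc == -1 && errno == EINTR);

    if (rc == -1) {
        int saved = errno;
        fprintf(stderr, "fcntl lock request failed: %s\n", strerror(saved));
        errno = saved;
        return -1;
    }
    return 0;
}

/* Write 'count' bytes at 'offset' while holding an exclusive range lock. */
static ssize_t locked_pwrite(int fd, const void *buf, size_t count, off_t offset)
{
    if (lock_range(fd, F_SETLKW, F_WRLCK, offset, (off_t)count) != 0) {
        return -1;  /* never write if the lock could not be taken */
    }

    ssize_t n = pwrite(fd, buf, count, offset);
    int saved = errno;

    /* Always try to release the lock, even if the write failed. */
    int unlock_rc = lock_range(fd, F_SETLK, F_UNLCK, offset, (off_t)count);
    if (n < 0) {
        errno = saved;
        return -1;
    }
    return unlock_rc == 0 ? n : -1;
}
```

The point of the sketch is that both the lock and the unlock results are checked and propagated to the caller rather than silently ignored.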
Edgar Gabriel
e62f9d2e52
fs/ufs: ensure that the never-lock flag is set if not on NFS
...
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-19 13:32:40 -05:00
Edgar Gabriel
f66c55f77a
fbtl/posix: fixes in the offset calculation and for aio operations
...
Our own internal test suite now passes correctly. More testing to follow.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-19 13:32:39 -05:00
Edgar Gabriel
a3c638bc38
fbtl/posix: add support for file locking for the non-blocking operations
...
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-19 13:32:38 -05:00
Edgar Gabriel
415e76514d
fbtl/posix: make the code compile
...
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-19 13:32:37 -05:00
Edgar Gabriel
f5e158c869
fbtl/posix: first cut in adding locking support
...
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-19 13:32:37 -05:00
Yossi Itigin
689f1be9b7
Merge pull request #4350 from alex-mikheev/topic/oshmem_spml_selection_fix
...
OSHMEM: add ucx to the list of default spmls
2017-10-19 17:59:52 +03:00
Gilles Gouaillardet
9771c575f5
Merge pull request #4352 from edgargabriel/pr/sem_close_fix
...
sharedfp/sm: close the named semaphore
2017-10-19 17:04:43 +09:00
Edgar Gabriel
4d995bd4eb
sharedfp/sm: close the named semaphore
...
In case a named semaphore is used, it is necessary to close the semaphore to remove
all sm segments. sem_unlink just removes the name references once all processes have closed
the sem.
Fixes issue: #4336
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
sharedfp/sm: unlink only needs to be called by one process
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-18 10:37:30 -05:00
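The distinction between sem_close and sem_unlink described above can be sketched with the plain POSIX API. This is an illustrative example, not the sharedfp/sm code; the function name `use_named_semaphore` and the `do_unlink` flag are hypothetical.

```c
#include <fcntl.h>
#include <semaphore.h>
#include <stdio.h>

/*
 * Open (creating if needed) a named semaphore, use it once, then
 * release it. 'name' must start with '/'. 'do_unlink' should be
 * nonzero in exactly one of the cooperating processes.
 * Returns 0 on success, -1 on failure.
 */
static int use_named_semaphore(const char *name, int do_unlink)
{
    int rc = 0;
    sem_t *sem = sem_open(name, O_CREAT, 0600, 1);

    if (sem == SEM_FAILED) {
        perror("sem_open");
        return -1;
    }

    /* Acquire and release once, standing in for the protected work. */
    if (sem_wait(sem) != 0) {
        rc = -1;
    } else if (sem_post(sem) != 0) {
        rc = -1;
    }

    /* Every process that opened the semaphore must close its handle. */
    if (sem_close(sem) != 0) {
        rc = -1;
    }

    /* Removing the name is needed only once, by a single process. */
    if (do_unlink && sem_unlink(name) != 0) {
        rc = -1;
    }
    return rc;
}
```

In a multi-process run, every process would call this with the same name, and only one of them (for example the lowest rank) would pass a nonzero `do_unlink`.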
Aurelien Bouteiller
3ef23f41a3
Bugfix a crash when a comm cannot be initialized
...
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2017-10-18 11:32:37 -04:00
Howard Pritchard
3345122f86
Merge pull request #4357 from hppritcha/topic/pmix_cray_fence_fix
...
pmix/cray: define fence method for cray pmix
2017-10-18 07:52:11 -06:00
Alex Mikheev
7cb7af1685
OSHMEM: add ucx to the list of default spmls
...
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-10-18 10:41:00 +03:00
Howard Pritchard
e8bfd494e7
pmix/cray: define fence method for cray pmix
...
Turns out UCX PML calls opal_pmix.fence in its del procs
method without checking whether or not the fence method
for the pmix component was defined. Rather than patch
UCX PML, actually define a fence method for the cray pmix.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-10-17 15:58:01 -06:00
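The failure mode described above, calling an optional function pointer that a component never set, can be sketched as follows. The types and names here (`example_pmix_module_t`, `guarded_fence`, `example_fence`) are hypothetical and much simpler than the real OPAL pmix module. The sketch shows both possible fixes: a caller-side guard, and the one the commit chose, defining the method in the component.

```c
#include <stddef.h>

#define EXAMPLE_SUCCESS            0
#define EXAMPLE_ERR_NOT_SUPPORTED (-1)

/* Hypothetical, simplified module table with an optional fence entry. */
typedef struct {
    int (*fence)(void);
} example_pmix_module_t;

/* Caller-side guard: never call through a NULL function pointer. */
static int guarded_fence(const example_pmix_module_t *mod)
{
    if (mod == NULL || mod->fence == NULL) {
        return EXAMPLE_ERR_NOT_SUPPORTED;
    }
    return mod->fence();
}

/* Component-side fix: provide a fence so that callers need no guard.
 * A real fence would synchronize the participating processes. */
static int example_fence(void)
{
    return EXAMPLE_SUCCESS;
}

/* A component that defines fence, and one that leaves it unset. */
static const example_pmix_module_t with_fence    = { .fence = example_fence };
static const example_pmix_module_t without_fence = { .fence = NULL };
```

Calling `without_fence.fence()` directly would dereference a NULL pointer, which is the kind of crash the commit avoids by defining the method.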
Ralph Castain
a76a61b2c9
Merge pull request #4348 from ggouaillardet/topic/pmix_libdir
...
configury: enhance external PMIx detection - add the --with-pmix-libd…
2017-10-17 10:05:48 -05:00
Gilles Gouaillardet
db2f3643d7
configury: enhance external PMIx detection: add the --with-pmix-libdir=DIR option (look for libpmix.* libs in DIR, DIR/lib64 and DIR); if --with-pmix=DIR is given, look for libpmix.* in DIR/lib64 and DIR/lib
...
Fixes open-mpi/ompi#4347
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-10-17 17:03:01 +09:00
Joshua Ladd
27eb401a84
Merge pull request #4344 from alinask/topic/oshmem_verbs_build_restore
...
OSHMEM/CONFIGURE: verbs component - restore the previous build behavior
2017-10-16 11:32:19 -04:00
Alina Sklarevich
c7f5d13550
OSHMEM/CONFIGURE: verbs component - restore the previous build behavior
...
If support was requested but not found, stop the build.
Signed-off-by: Alina Sklarevich <alinas@mellanox.com>
2017-10-16 11:53:02 +03:00
Ralph Castain
6d7a780016
Merge pull request #4341 from rhc54/topic/foo
...
Ensure that the pmix server system-level rendezvous file is only output by the HNP
2017-10-14 13:17:42 -05:00
Ralph Castain
6ffb0d0507
Ensure that the pmix server system-level rendezvous file is only output by the HNP as (at least for slurm on cray) a daemon could be colocated with the HNP and overwrite the file. Update the scaling.pl script to only use the system-level rendezvous so it doesn't get rejected by a colocated daemon
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-14 10:16:49 -07:00
Ralph Castain
b75ed83d4b
Merge pull request #4340 from rhc54/topic/update
...
Sync to PMIx v3. Ensure prun uses the ess/tool component.
2017-10-14 11:51:05 -05:00
Ralph Castain
60b338e857
Sync to PMIx v3. Ensure prun uses the ess/tool component.
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-14 08:24:57 -07:00
Ralph Castain
ba4ec735e0
Merge pull request #4339 from rhc54/topic/s2
...
Ensure we exit with an appropriate error code when hitting a PMI2 error
2017-10-13 22:46:52 -05:00
Ralph Castain
8ae10c9e1a
Ensure we exit with an appropriate error code when hitting a PMI2 error
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-13 19:30:28 -07:00
Ralph Castain
e1d1c8d3b2
Merge pull request #4337 from rhc54/topic/scaling
...
Update the scaling.pl script
2017-10-13 20:25:42 -05:00
Ralph Castain
31bce4ba9c
Update the scaling.pl script
...
* check that the command succeeds when pre-positioning the file to ensure there isn't an error somewhere in the execution
* properly define srun cmd line options
* terminate the orte-dvm only when it is actually in operation so prun doesn't generate spurious error messages
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-13 18:23:18 -07:00