Commit graph

27876 commits

Author SHA1 Message Date
Ralph Castain
05f98d66a6 Merge pull request #4396 from rhc54/topic/pmixconfig
Alter the PMIx embedded configuration
2017-10-25 10:10:50 -05:00
Gilles Gouaillardet
c4650b5904 Merge pull request #4383 from ggouaillardet/topic/configury_ucx
configury: revamp ucx detection
2017-10-25 15:33:38 +09:00
Ralph Castain
8fbfe68754 Alter the PMIx embedded configuration so that we can build static with devel headers - if the builder requests that we install a separate libpmix, then don't prefix the PMIx variables.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-24 21:45:27 -07:00
Ralph Castain
cf3bc4f55b Merge pull request #4346 from matcabral/psm2_mtl_mq_thread_fix
MTL PSM2: add a thread lock while peeking and completing the psm2 requests.
2017-10-24 16:41:29 -05:00
Ralph Castain
14a0701949 Merge pull request #4391 from rhc54/topic/scale
Add timeout option to scaling script
2017-10-24 14:34:48 -05:00
Ralph Castain
e7c6718d29 Add timeout option to scaling script
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-24 12:33:22 -07:00
Ralph Castain
987aac1268 Merge pull request #4387 from rhc54/topic/dmodx
We should never block when requesting dmodex data from the PMIx server
2017-10-24 10:46:04 -05:00
Ralph Castain
292983261a We should never block when requesting dmodex data from the PMIx server as this will block it from being able to accept connections from local clients. Do not deregister standing dmodx requests when a fence completes unless we actually collected the data in the fence
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-24 07:51:10 -07:00
Ralph Castain
70c455938b Merge pull request #4382 from rhc54/topic/scaling
Update the scaling script to avoid use of "system" command
2017-10-23 22:43:25 -05:00
Ralph Castain
0353be9704 Update MPI init to properly skip barriers
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-23 19:28:34 -07:00
Gilles Gouaillardet
af03f55aa8 configury: revamp ucx detection
- when --with-ucx=DIR is not set, try the default path and fall back to /opt/ucx
- when --with-ucx-libdir is not set, try lib64 and then lib directories
- do not handle --with-ucx-libdir (this is a user mistake, no need to over-complicate our logic)

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-10-24 09:27:57 +09:00
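The library directory search order described in this commit message can be sketched as follows. This is only an illustration in Python, not the actual configure logic (which lives in the project's configury); the function name `find_libdir` and its parameters are hypothetical.

```python
import os
import tempfile

def find_libdir(prefix, explicit_libdir=None, candidates=("lib64", "lib")):
    """Pick a library directory under prefix.

    An explicitly given directory wins; otherwise try lib64, then lib.
    Returns None if no candidate directory exists.
    """
    if explicit_libdir is not None:
        return explicit_libdir
    for name in candidates:
        path = os.path.join(prefix, name)
        if os.path.isdir(path):
            return path
    return None

# Usage: with only "lib" present, the search falls through lib64 to lib.
with tempfile.TemporaryDirectory() as prefix:
    os.mkdir(os.path.join(prefix, "lib"))
    print(find_libdir(prefix))
```

Trying lib64 first matches the order stated in the message; the explicit override stands in for a user-supplied libdir option.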
Ralph Castain
3b71be4db4 Update the scaling script to avoid use of "system" command, thus ensuring that each command sees the same environment. Fix prun to pickup and propagate OMPI MCA params
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-23 16:27:41 -07:00
bosilca
ac348da13a Merge pull request #4374 from bosilca/topic/osx_syslog
Topic/osx syslog
2017-10-23 18:06:36 -04:00
Ralph Castain
0721d933fc Merge pull request #4376 from rhc54/topic/interlib
Update the interlib example to show an alternative method for model declaration
2017-10-23 14:53:13 -05:00
Ralph Castain
e33f319380 Update example to show tests of various APIs
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-23 12:02:54 -07:00
Ralph Castain
6ea3c8a0bd Update the interlib example to show an alternative method for model declaration. Add a missing range value to the OPAL layer. Make it easier to see OMPI model callbacks
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-23 11:27:42 -07:00
George Bosilca
8f32b345de Address syslog issues on OSX 10.13 with gcc 7.x
gcc 7.[1,2] (at least) fails to correctly parse the OSX 10.13 sys/syslog.h
header. As a result, we need to protect syslog support in OPAL, PMIX and
ORTE.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-10-23 14:02:10 -04:00
Ralph Castain
79aef9369e Merge pull request #4371 from rhc54/topic/xvr
Updates to support cross-version operations with OMPI v2.x
2017-10-22 11:41:18 -05:00
Ralph Castain
a63904d47f Updates to support cross-version operations with OMPI v2.x
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-22 08:38:33 -07:00
Matias Cabral
b81bcd4b0d MTL PSM2: add a thread lock while peeking and completing the psm2 requests.
Reviewed-by: Gopalakrishnan, Aravind <aravind.gopalakrishnan@intel.com>
Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
2017-10-20 14:46:48 -07:00
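The general idea of holding a lock across both the peek and the completion step can be sketched as below. This is a generic Python illustration, not the PSM2 MTL code; `RequestQueue`, `progress`, and `drain` are hypothetical names, and the queue is a stand-in for a library's pending-request list.

```python
import threading
from collections import deque

class RequestQueue:
    """Hypothetical stand-in for a queue of pending requests."""

    def __init__(self, requests):
        self._pending = deque(requests)
        self._lock = threading.Lock()
        self.completed = []

    def progress(self):
        """Peek at the next request and complete it as one atomic step."""
        with self._lock:
            if not self._pending:
                return None
            req = self._pending[0]       # peek at the next request
            self._pending.popleft()      # complete it: remove from pending
            self.completed.append(req)   # record the completion
            return req

def drain(queue):
    # Keep making progress until no requests remain.
    while queue.progress() is not None:
        pass

queue = RequestQueue(range(1000))
threads = [threading.Thread(target=drain, args=(queue,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the lock covers both steps, no two threads can peek at the same request and then both try to complete it.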
Edgar Gabriel
defe73984a Merge pull request #4362 from edgargabriel/topic/fbtl-locking-support
Add file locking support in posix fbtl
2017-10-19 23:21:17 -05:00
Ralph Castain
f374ba161c Merge pull request #4366 from rhc54/topic/notify
Fix event registration so OpenMP/MPI coordination sides can both get notification of model declarations
2017-10-19 21:05:57 -05:00
Ralph Castain
f8ce31f13c Fix event registration so OpenMP/MPI coordination sides can both get notification of model declarations
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-19 18:06:38 -07:00
Ralph Castain
4eee95358b Merge pull request #4365 from rhc54/topic/routed
Ensure we update the routing plan so that tree spawn works!
2017-10-19 17:27:47 -05:00
Ralph Castain
75d411f3ea Ensure we update the routing plan so that tree spawn works!
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-19 14:02:06 -07:00
Edgar Gabriel
be0de21e6f fs/ufs and fbtl/posix: cleanup lock management
This commit looks large, but it's really mostly a cleanup step.
1. introduce proper error handling for the return values of fcntl and the fbtl_posix_lock function
2. rename a parameter to more accurately reflect what it does
3. introduce an mca parameter in the fs/ufs component that lets the user control
   the level of locking to enforce
4. move the initialization of the fs_block_size parameter from fs/ufs into the
   common/ompio component. An fs component might be allowed to overwrite this
   value, but none of the actual fs components do that.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-19 14:56:28 -05:00
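For readers unfamiliar with fcntl-based byte-range locking, the pattern of checking the lock call's result before doing I/O can be sketched in Python. This is not the fbtl/posix implementation; `write_locked` is a hypothetical helper, and Python's `fcntl.lockf` reports failure by raising `OSError` rather than through a return value.

```python
import fcntl
import os
import tempfile

def write_locked(fd, offset, data):
    """Write data at offset while holding an exclusive lock on that byte range.

    Returns the number of bytes written. Raises OSError if the lock
    cannot be acquired, instead of silently writing without it.
    """
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX, len(data), offset, os.SEEK_SET)
    except OSError as exc:
        raise OSError(exc.errno, f"could not lock range: {exc.strerror}") from exc
    try:
        return os.pwrite(fd, data, offset)
    finally:
        # Always release the range, even if the write itself failed.
        fcntl.lockf(fd, fcntl.LOCK_UN, len(data), offset, os.SEEK_SET)

# Usage on a scratch file (opened read/write by default).
with tempfile.NamedTemporaryFile() as tmp:
    written = write_locked(tmp.fileno(), 4, b"data")
    print(written)
```

Locking only the byte range being written, rather than the whole file, lets writers to disjoint regions proceed without blocking each other.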
Edgar Gabriel
e62f9d2e52 fs/ufs: ensure that the never-lock flag is set if not on NFS
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-19 13:32:40 -05:00
Edgar Gabriel
f66c55f77a fbtl/posix: fixes in the offset calculation and for aio operations
Our own internal test suite now passes correctly. More testing to follow.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-19 13:32:39 -05:00
Edgar Gabriel
a3c638bc38 fbtl/posix: add support for file locking for the non-blocking operations
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-19 13:32:38 -05:00
Edgar Gabriel
415e76514d fbtl/posix: make the code compile
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-19 13:32:37 -05:00
Edgar Gabriel
f5e158c869 fbtl/posix: first cut in adding locking support
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-19 13:32:37 -05:00
Yossi Itigin
689f1be9b7 Merge pull request #4350 from alex-mikheev/topic/oshmem_spml_selection_fix
OSHMEM: add ucx to the list of default spmls
2017-10-19 17:59:52 +03:00
Gilles Gouaillardet
9771c575f5 Merge pull request #4352 from edgargabriel/pr/sem_close_fix
sharedfp/sm: close the named semaphore
2017-10-19 17:04:43 +09:00
Edgar Gabriel
4d995bd4eb sharedfp/sm: close the named semaphore
In case a named semaphore is used, it is necessary to close the semaphore to remove
all sm segments. sem_unlink just removes the name references once all processes have closed
the sem.

Fixes issue: #4336

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>

sharedfp/sm: unlink only needs to be called by one process

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-18 10:37:30 -05:00
Aurelien Bouteiller
3ef23f41a3 Bugfix a crash when a comm cannot be initialized
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2017-10-18 11:32:37 -04:00
Howard Pritchard
3345122f86 Merge pull request #4357 from hppritcha/topic/pmix_cray_fence_fix
pmix/cray: define fence method for cray pmix
2017-10-18 07:52:11 -06:00
Alex Mikheev
7cb7af1685 OSHMEM: add ucx to the list of default spmls
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-10-18 10:41:00 +03:00
Howard Pritchard
e8bfd494e7 pmix/cray: define fence method for cray pmix
Turns out UCX PML calls opal_pmix.fence in its del procs
method without checking whether or not the fence method
for the pmix component was defined.  Rather than patch
UCX PML, actually define a fence method for the cray pmix.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-10-17 15:58:01 -06:00
Ralph Castain
a76a61b2c9 Merge pull request #4348 from ggouaillardet/topic/pmix_libdir
configury: enhance external PMIx detection - add the --with-pmix-libd…
2017-10-17 10:05:48 -05:00
Gilles Gouaillardet
db2f3643d7 configury: enhance external PMIx detection
- add the --with-pmix-libdir=DIR option look for libpmix.* libs in DIR, DIR/lib64 and DIR
- if --with-pmix=DIR is given, look for libpmix.* in DIR/lib64 and DIR/lib
Fixes open-mpi/ompi#4347

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-10-17 17:03:01 +09:00
Joshua Ladd
27eb401a84 Merge pull request #4344 from alinask/topic/oshmem_verbs_build_restore
OSHMEM/CONFIGURE: verbs component - restore the previous build behavior
2017-10-16 11:32:19 -04:00
Alina Sklarevich
c7f5d13550 OSHMEM/CONFIGURE: verbs component - restore the previous build behavior
In case where support was requested but not found, stop the build.

Signed-off-by: Alina Sklarevich <alinas@mellanox.com>
2017-10-16 11:53:02 +03:00
Ralph Castain
6d7a780016 Merge pull request #4341 from rhc54/topic/foo
Ensure that the pmix server system-level rendezvous file is only output by the HNP
2017-10-14 13:17:42 -05:00
Ralph Castain
6ffb0d0507 Ensure that the pmix server system-level rendezvous file is only output by the HNP as (at least for slurm on cray) a daemon could be colocated with the HNP and overwrite the file. Update the scaling.pl script to only use the system-level rendezvous so it doesn't get rejected by a colocated daemon
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-14 10:16:49 -07:00
Ralph Castain
b75ed83d4b Merge pull request #4340 from rhc54/topic/update
Sync to PMIx v3. Ensure prun uses the ess/tool component.
2017-10-14 11:51:05 -05:00
Ralph Castain
60b338e857 Sync to PMIx v3. Ensure prun uses the ess/tool component.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-14 08:24:57 -07:00
Ralph Castain
ba4ec735e0 Merge pull request #4339 from rhc54/topic/s2
Ensure we exit with an appropriate error code when hitting a PMI2 error
2017-10-13 22:46:52 -05:00
Ralph Castain
8ae10c9e1a Ensure we exit with an appropriate error code when hitting a PMI2 error
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-13 19:30:28 -07:00
Ralph Castain
e1d1c8d3b2 Merge pull request #4337 from rhc54/topic/scaling
Update the scaling.pl script
2017-10-13 20:25:42 -05:00
Ralph Castain
31bce4ba9c Update the scaling.pl script
* check that the command succeeds when pre-positioning the file to ensure there isn't an error somewhere in the execution

* properly define srun cmd line options

* terminate the orte-dvm only when it is actually in operation so prun doesn't generate spurious error messages

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-13 18:23:18 -07:00