Commit Graph

27402 Commits

Author SHA1 Message Date
Nathan Hjelm
c27e277832 Merge pull request #3745 from hjelmn/s390_asm_detect
configure: add builtin asm check for s390/s390x
2017-06-23 09:35:52 -06:00
Nathan Hjelm
b638612a98 Merge pull request #3744 from hjelmn/ompi_cov
opal: fix coverity issues
2017-06-23 09:09:42 -06:00
Nathan Hjelm
52d44afb74 Merge pull request #3743 from hjelmn/abstration_fix
opal/info: fix abstraction break
2017-06-23 08:59:58 -06:00
Ralph Castain
168e50bc13 Also need to avoid calling destruct on the opal_process_info struct after finalize
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-23 07:49:14 -07:00
Ralph Castain
e2160d1949 Merge pull request #3741 from rhc54/topic/info
Remove stale field
2017-06-23 07:31:50 -07:00
Nathan Hjelm
bc54c99e12 configure: add builtin asm check for s390/s390x
We accepted a change that enabled CMA on s390 and s390x. This change
had the side-effect that we were no longer using the builtin atomics
for these systems. This is a problem since we do not have ASM for
s390 and s390x. This commit restores the atomics.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-06-23 08:27:48 -06:00
Nathan Hjelm
db973437e1 opal: fix coverity issues
Fixes coverity CIDs 1412984, and 1412983.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-06-23 08:15:34 -06:00
Nathan Hjelm
9c621ad5a4 opal/info: fix abstraction break
The new info infrastructure introduced an abstraction break by
including mpi.h and using MPI_ constants in opal. This commit fixes
the break by changing the constants to their opal equivalents.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-06-23 08:03:01 -06:00
Ralph Castain
0c258c32e8 Merge pull request #3737 from rhc54/topic/fixes
Fix pmix.query support
2017-06-23 06:25:48 -07:00
Ralph Castain
3af9344764 Remove stale field
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-23 06:22:31 -07:00
Ralph Castain
6ec2ad5288 Fix the pmix_query API when it asks for something that returns an array of pmix_info_t. Protect the PMIX_INFO_FREE macro from NULL arrays. Update the mpi_memprobe scaling test
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-22 20:11:36 -07:00
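The NULL protection described in commit 6ec2ad5288 can be illustrated with a minimal sketch. The `demo_info_t` type, the `DEMO_INFO_FREE` macro, and the `demo_info_create` helper are hypothetical stand-ins, not the actual PMIx definitions; the sketch only shows the guard pattern of skipping the free when the array pointer is NULL.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an info entry; NOT the real pmix_info_t. */
typedef struct {
    char *key;
} demo_info_t;

/* Free an array of n entries and reset the pointer.
 * The whole body is skipped when the array is NULL. */
#define DEMO_INFO_FREE(arr, n)                                \
    do {                                                      \
        if (NULL != (arr)) {                                  \
            for (size_t i_ = 0; i_ < (size_t)(n); ++i_) {     \
                free((arr)[i_].key);                          \
            }                                                 \
            free(arr);                                        \
            (arr) = NULL;                                     \
        }                                                     \
    } while (0)

/* Hypothetical helper: allocate n entries, each with a small key. */
demo_info_t *demo_info_create(size_t n)
{
    demo_info_t *arr = calloc(n, sizeof(*arr));
    if (NULL == arr) {
        return NULL;
    }
    for (size_t i = 0; i < n; ++i) {
        arr[i].key = malloc(4);
        if (NULL != arr[i].key) {
            memcpy(arr[i].key, "key", 4); /* copies the terminating NUL too */
        }
    }
    return arr;
}
```

With this guard, a query that returns no array can pass its NULL pointer to the macro without a crash.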
Nathan Hjelm
d8938ca0cc Merge pull request #3734 from hjelmn/osc_rdma_fix
osc/rdma: cleanup local peer setup and fix a bug
2017-06-22 14:31:05 -06:00
Nathan Hjelm
31ab83362a osc/rdma: cleanup local peer setup and fix a bug
The data endpoint was not being set correctly for local peers in some
cases. This commit fixes the bug and cleans the associated code to
simplify the logic.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-06-22 13:28:45 -06:00
Nathan Hjelm
4252258338 Merge pull request #3721 from hjelmn/list_cleanup
opal: use opal_list_t convenience macros
2017-06-22 09:12:23 -06:00
Ralph Castain
5f41a6da2b Merge pull request #3730 from rhc54/topic/cov
Silence Coverity warnings
2017-06-21 15:41:09 -07:00
Ralph Castain
3e78f84093 Silence Coverity warnings
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-21 13:19:51 -07:00
Ralph Castain
0f54799003 Merge pull request #3725 from rhc54/topic/cleanup
Ensure we properly cleanup on termination, including when terminating due to ctrl-c
2017-06-21 12:19:01 -07:00
Ralph Castain
38636f4f0a Ensure we properly cleanup on termination, including when terminating due to ctrl-c
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-21 06:33:37 -07:00
Ralph Castain
c38866eb39 Merge pull request #3724 from rhc54/topic/clean
Update orte-clean so it cleans legacy session directories as well as pmix artifacts
2017-06-20 19:25:19 -07:00
Ralph Castain
2aa286c9d0 Update orte-clean so it cleans legacy session directories as well as pmix artifacts
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-20 17:46:39 -07:00
Ralph Castain
d8b96983c2 Merge pull request #3722 from rhc54/topic/ext2x
Update the ext2x component to match the internal one
2017-06-20 12:40:05 -07:00
Ralph Castain
cba127bc43 Update the ext2x component to match the internal one
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-20 11:42:14 -07:00
Nathan Hjelm
ffd8ee2dfd opal: use opal_list_t convenience macros
This commit cleans up code in opal to use OPAL_LIST_FOREACH(_SAFE),
OPAL_LIST_DESTRUCT, and OPAL_LIST_RELEASE.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-06-20 12:37:12 -06:00
Ralph Castain
501ba8faad Merge pull request #3704 from rhc54/topic/signal
Control distribution of signals to children vs grandchildren
2017-06-20 11:11:43 -07:00
Ralph Castain
814e858082 Merge pull request #3696 from rhc54/topic/pmix3
Update PMIx and integration glue
2017-06-20 10:38:29 -07:00
Ralph Castain
952726c121 Update to latest PMIx master - equivalent to 2.0rc2. Update the thread support in the opal/pmix framework to protect the framework-level structures.
This now passes the loop test, and so we believe it resolves the random hangs in finalize.

Changes in PMIx master that are included here:

* Fixed a bug in the PMIx_Get logic
* Fixed self-notification procedure
* Made pmix_output functions thread safe
* Fixed a number of thread safety issues
* Updated configury to use 'uname -n' when hostname is unavailable

Work on cleaning up the event handler thread safety problem
Rarely used functions, but protect them anyway
Fix the last part of the intercomm problem
Ensure we don't cover any PMIx calls with the framework-level lock.
Protect against NULL argv comm_spawn

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-20 09:02:15 -07:00
George Bosilca
1f291c8728 Add the fragment to the unexpected frags only after extracting the
pml_proc.
2017-06-20 16:03:52 +02:00
Gilles Gouaillardet
9ba85b85e1 coll/libnbc: revisit NBC_Handle usage
make NBC_Handle (almost) an internal structure created
by NBC_Schedule_request()
use a local variable instead of what was previously handle->tmpbuf

Refs open-mpi/ompi#3487

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-06-20 17:24:16 +09:00
Gilles Gouaillardet
68ac95003f coll/base: fix zero size datatype handling in mca_coll_base_alltoallv_intra_basic_inplace()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-06-20 14:36:35 +09:00
KAWASHIMA Takahiro
dc24f3f1a8 Merge pull request #3716 from kawashima-fj/pr/delay-message
opal/util: Get rid of `\0` from abort delay message
2017-06-20 10:51:26 +09:00
Nathan Hjelm
0c8c7e50d0 Merge pull request #3682 from hjelmn/comm_assertions
ompi: add support for new communicator info assertions
2017-06-19 09:49:59 -06:00
KAWASHIMA Takahiro
3afc61644d opal/util: Get rid of \0 from abort delay message
My recent commit 6b91edd had this bug.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-06-19 20:08:34 +09:00
Edgar Gabriel
70107b3e52 Merge pull request #3703 from edgargabriel/pr/cart-comm-file-open-fix
Pr/cart comm file open fix
2017-06-15 15:03:38 -05:00
Ralph Castain
206aec6083 By default, apply signals to all direct children _and_ any children they might have spawned (so long as they remain in the same process group). Provide an MCA param (odls_base_signal_direct_children_only) to indicate that the signal is to go _only_ to our direct children, and not be delivered to any children spawned by those procs.
Refs https://www.mail-archive.com/users@lists.open-mpi.org/msg31221.html

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-15 12:26:11 -07:00
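The difference between signalling only a direct child and signalling its whole process group, as described in commit 206aec6083, can be sketched with POSIX `kill(2)`. The helper names below are hypothetical and this is not the actual odls code; the sketch assumes each direct child is the leader of its own process group, so that a negative pid addresses the child together with any descendants that stayed in that group.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>

/* Hypothetical helper: pick the pid argument for kill(2).
 * A positive pid targets one process; a negative pid targets
 * every process in the group whose ID is the absolute value. */
pid_t demo_signal_target(pid_t child, int direct_children_only)
{
    return direct_children_only ? child : -child;
}

/* Hypothetical helper: deliver sig according to the chosen policy.
 * Returns whatever kill(2) returns (0 on success, -1 on error). */
int demo_deliver(pid_t child, int sig, int direct_children_only)
{
    return kill(demo_signal_target(child, direct_children_only), sig);
}
```

In this sketch, `direct_children_only` plays the role that the commit assigns to the `odls_base_signal_direct_children_only` MCA parameter.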
Edgar Gabriel
3b0b8fa12c io/ompio: update cartesian based grouping strategy
update the cartesian communicator based grouping strategy to match the other
algorithms used in the aggregator selection process.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-06-15 14:05:54 -05:00
Edgar Gabriel
bd6b430798 common/ompio: remove function call to cart_based_grouping
the cart_based_grouping aggregator strategy was not correctly updated
during the last major rewrite of the aggregator selection algorithm.
It is also not supposed to be called from file_open (but from
file_set_view).

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-06-15 14:04:03 -05:00
Ralph Castain
a7741ab120 Merge pull request #3702 from rhc54/topic/rf
Fix rank-file mapper launch by correctly setting up the remote map from the provided data
2017-06-15 10:06:50 -07:00
Ralph Castain
8f09929469 Fix rank-file mapper launch by correctly setting up the remote map from the provided data
Put a simple protection for the case where procs fail while we are trying to deregister handlers

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-15 08:33:29 -07:00
Ralph Castain
7d07659367 Merge pull request #3699 from rhc54/topic/bound
Only set the "bound" flag if we were actually bound
2017-06-14 16:47:02 -07:00
Ralph Castain
8afa1433b8 Only set the "bound" flag if we were actually bound
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-14 13:22:01 -07:00
George Bosilca
e9d533e62e Fix warnings from non-debug mode.
Thanks Ralph for the report.
2017-06-13 16:57:42 -04:00
Gilles Gouaillardet
cb473ee00c Merge pull request #3691 from ggouaillardet/topic/hostname_or_uname
configury: use 'uname -n' when 'hostname' is not available
2017-06-12 16:25:15 +09:00
Gilles Gouaillardet
72c7329462 configury: use 'uname -n' when 'hostname' is not available
the 'hostname' command might not be available on some platforms
such as Fedora Core 26, so mimic config/libtool.m4 and fall back
to 'uname -n' if needed

Refs. #3680

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-06-12 15:04:32 +09:00
KAWASHIMA Takahiro
b5b6b22848 Merge pull request #3678 from kawashima-fj/pr/signal-abort-delay
Apply `opal_abort_delay` to the OPAL signal handler
2017-06-12 10:35:11 +09:00
Josh Hursey
c4971cf267 Merge pull request #3688 from jjhursey/fix/romio-314-missing-ops
ROMIO 3.1.4 : Add support for missing ops
2017-06-09 16:59:25 -05:00
Joshua Hursey
80a91dc244 io/romio314: Add work around support for missing MPI_File ops
* Add work around support for the following missing ops in ROMIO 3.1.4
    - `MPI_File_iread_at_all`
    - `MPI_File_iwrite_at_all`
    - `MPI_File_iread_all`
    - `MPI_File_iwrite_all`

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-06-09 14:42:59 -05:00
Joshua Hursey
29609631a2 mpi/c: Protect some IO functions not widely implemented
* Protects us from segv when ROMIO 314 is selected and one of the
   following operations is called:
   - MPI_File_iread_at_all
   - MPI_File_iwrite_at_all
   - MPI_File_iread_all
   - MPI_File_iwrite_all

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-06-09 11:42:26 -05:00
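The protection described in commit 29609631a2 amounts to checking that the selected I/O component actually provides an operation before calling it. A minimal sketch of that guard follows; the module struct, function names, and error code are hypothetical and do not reproduce the real Open MPI io framework interface.

```c
#include <stddef.h>

#define DEMO_SUCCESS          0
#define DEMO_ERR_UNSUPPORTED  (-1)  /* hypothetical error code */

/* Hypothetical signature for a nonblocking collective read. */
typedef int (*demo_iread_all_fn)(void *buf, int count);

/* Hypothetical module table; a component that lacks the
 * operation leaves the pointer NULL. */
typedef struct {
    demo_iread_all_fn iread_all;
} demo_io_module_t;

/* Front-end wrapper: return an error instead of jumping through
 * a NULL function pointer (which would cause a segv). */
int demo_file_iread_all(const demo_io_module_t *module, void *buf, int count)
{
    if (NULL == module || NULL == module->iread_all) {
        return DEMO_ERR_UNSUPPORTED;
    }
    return module->iread_all(buf, count);
}

/* Trivial implementation used to show the supported path. */
int demo_iread_all_impl(void *buf, int count)
{
    (void)buf;
    (void)count;
    return DEMO_SUCCESS;
}
```

A caller then receives an error code when the operation is missing, rather than crashing inside the wrapper.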
Ralph Castain
2ffaf5ea9f Merge pull request #3686 from rhc54/topic/slurm
Print a better error message when srun isn't found in the path.
2017-06-09 09:27:20 -07:00
Ralph Castain
6ef87c8b83 Merge pull request #3687 from rhc54/topic/ext1x
Forward-port changes proposed for v3.0 to master
2017-06-09 09:27:02 -07:00
Ralph Castain
548cd24e4e Forward-port changes proposed for v3.0 to master from PR #3677
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-09 07:51:21 -07:00