Brelle Emmanuel
e630046a4b
pml/ob1: fixed local handle sent during PUT control message
...
When ob1 uses btl_put, the handle of the locally registered memory is
sent with a PUT control message. In the current master code the handle
sent is always the one in the frag, but if the handle has been
successfully registered in the request, the frag structure does not
hold a valid handle and all fragments use the request's one.
I suggest checking whether the handle in the fragment is valid and, if
not, sending the handle from the request.
Signed-off-by: Brelle Emmanuel <emmanuel.brelle@atos.net>
2019-04-01 18:45:05 +02:00
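A minimal sketch of the fallback described in e630046a4b above, not the actual ob1 patch. The types `reg_handle_t`, `rdma_request_t`, `rdma_frag_t` and the function `select_put_handle()` are hypothetical stand-ins for the much larger ob1 structures; only the selection rule (frag handle if valid, otherwise the request handle) comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a BTL registration handle. */
typedef struct reg_handle { int id; } reg_handle_t;

/* Hypothetical request: holds a handle when the whole buffer was
 * registered once for all fragments. */
typedef struct rdma_request {
    reg_handle_t *local_handle;
} rdma_request_t;

/* Hypothetical fragment: holds its own handle only when the fragment
 * registered memory itself; otherwise the field is NULL. */
typedef struct rdma_frag {
    reg_handle_t *local_handle;
    rdma_request_t *request;
} rdma_frag_t;

/* Pick the handle to place in the PUT control message: prefer the
 * fragment's handle and fall back to the request's handle. */
static reg_handle_t *select_put_handle(const rdma_frag_t *frag)
{
    assert(NULL != frag && NULL != frag->request);
    if (NULL != frag->local_handle) {
        return frag->local_handle;
    }
    return frag->request->local_handle;
}
```

With this rule a fragment whose own handle is NULL still sends a usable handle, as long as the request registered the buffer.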
KAWASHIMA Takahiro
76516bc70c
Merge pull request #6542 from kawashima-fj/pr/man-typo
...
man: Fix typo of MPI_TYPE_GET_NAME
2019-03-29 13:06:46 +09:00
KAWASHIMA Takahiro
63a1968459
man: Fix typo of MPI_TYPE_GET_NAME
...
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2019-03-29 13:01:52 +09:00
bosilca
b54fdf5dd9
Merge pull request #6541 from bwbarrett/bugfix/enotconn
...
btl/tcp: Skip printing error message in racy cleanup path
2019-03-28 22:42:52 -04:00
Brian Barrett
d5360711fa
btl/tcp: Skip printing error message in racy cleanup path
...
Avoid printing an error message about ENOTCONN return codes from
getpeername() when handling an incoming connection request. At
this point in the receive state machine, the remote process has
been verified to be a valid OMPI instance. In all-to-all startup
at 4k rank scale, we're seeing this error message when the remote
side drops the connection because it realizes it's the "loser"
in the connection race. We were already doing all the right things,
other than printing a scary error message. So skip the error
message and call it good.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2019-03-28 23:12:35 +00:00
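A rough sketch of the behaviour described in d5360711fa above, assuming a POSIX socket API. The helpers `peername_error_is_benign()` and `lookup_peer()` are hypothetical names and do not appear in btl/tcp; the sketch only shows the idea of treating ENOTCONN from getpeername() as an expected outcome of the connection race while still reporting other errors.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* ENOTCONN is expected when the peer dropped the connection because it
 * lost the connection race, so it is not worth an error message. */
static int peername_error_is_benign(int err)
{
    return ENOTCONN == err;
}

/* Look up the peer address of socket sd.
 * Returns 0 on success and -1 on failure; only unexpected failures
 * are printed to stderr. */
static int lookup_peer(int sd, struct sockaddr_storage *addr)
{
    socklen_t len;
    int err;

    assert(NULL != addr);
    len = sizeof(*addr);
    if (0 == getpeername(sd, (struct sockaddr *) addr, &len)) {
        return 0;
    }
    err = errno;  /* save errno before any other library call */
    if (!peername_error_is_benign(err)) {
        fprintf(stderr, "getpeername() failed: %s (%d)\n",
                strerror(err), err);
    }
    return -1;
}
```

The caller cleans up the socket in both failure cases; only the logging differs.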
Jeff Squyres
05c5e2034b
Merge pull request #6527 from James-A-Clark/master
...
Add compilation flag to allow unwinding through files that are present in the stack when attaching with MPIR
2019-03-28 18:16:02 -04:00
Jeff Squyres
3c1b33c93a
Merge pull request #6140 from bertwesarg/fix-cpp-condition
...
Fix use of bitwise operation in CPP condition
2019-03-28 10:06:20 -04:00
Nathan Hjelm
34d0790558
Merge pull request #6526 from ggouaillardet/topic/vader_fini
...
btl/vader: fix finalize sequence
2019-03-27 12:12:00 -06:00
James Clark
20f5840cbb
Add a compilation flag that adds unwind info to all files that are present in the stack starting from MPI_Init.
...
This is so when a debugger attaches using MPIR, it can step out of this stack back into main.
This cannot be done with certain aggressive optimisations and missing debug information.
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Co-authored-by: Jeff Squyres <jsquyres@cisco.com>
2019-03-27 14:32:15 +00:00
Gilles Gouaillardet
77060cad07
btl/vader: fix finalize sequence
...
free the component mpool in mca_btl_vader_component_close(),
after freeing the objects that depend on it, such as
mca_btl_vader_component.vader_frags_user
Thanks Christoph Niethammer for reporting this.
Refs. open-mpi/ompi#6524
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-03-27 11:57:40 +09:00
Ralph Castain
4e5cacc8db
Merge pull request #6523 from rhc54/topic/nid
...
Sync nidmap to PRRTE to fix hetero topo problem
2019-03-26 09:22:58 -07:00
Ralph Castain
8174286530
Sync nidmap to PRRTE to fix hetero topo problem
...
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-03-26 08:24:09 -07:00
Ralph Castain
dfbc14430d
Merge pull request #6440 from ggouaillardet/topic/yield_when_idle
...
schizo/ompi: correctly handle the yield_when_idle option
2019-03-25 12:17:34 -07:00
Geoff Paulsen
44b3aa244b
Merge pull request #6510 from sam6258/int4_cswap_fix
...
shmem/fortran: Fix invalid datatype size in call to atomic cswap
2019-03-25 11:49:00 -05:00
Gilles Gouaillardet
97b7fab872
Merge pull request #6516 from ggouaillardet/topic/pmix_refresh
...
pmix/pmix4x: refresh to the latest PMIx
2019-03-25 14:48:45 +09:00
Gilles Gouaillardet
e844f76725
pmix/pmix4x: refresh to the latest PMIx
...
refresh pmix4x to pmix/pmix@20cc9c041e
Fixes open-mpi/ompi#6513
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-03-25 13:33:18 +09:00
Artem Polyakov
bfff5783f9
Merge pull request #6371 from artpol84/osc/select_dbg
...
osc/base: Add debug output stating a selected component
2019-03-22 22:24:04 -07:00
Joshua Ladd
9ab6ecba65
Merge pull request #6492 from janjust/oshmem-multiple-contexts-master
...
Oshmem multiple contexts
2019-03-22 17:34:46 -04:00
Xin Zhao
9c3d00b144
ompi/oshmem/spml/ucx: use lockfree array to optimize spml_ucx_progress/delete oshmem_barrier in shmem_ctx_destroy
...
ompi/oshmem/spml/ucx: optimize spml ucx progress
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-21 23:01:45 +02:00
Xin Zhao
e0414006b0
ompi/oshmem/spml/ucx: delete oob path of getting rkeys in spml ucx
...
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-21 23:01:45 +02:00
Xin Zhao
e1c1ab0202
ompi/oshmem/spml/ucx: defer clean up shmem_ctx to shmem_finalize
...
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-21 23:01:37 +02:00
Scott Miller
6b294e0641
shmem/fortran: Fix invalid datatype size in call to atomic cswap
...
Signed-off-by: Scott Miller <scott.miller1@ibm.com>
2019-03-20 21:57:08 -04:00
Josh Hursey
53cd31ed7e
Merge pull request #6504 from jjhursey/rm-hash-pmix4
...
Do not force 'hash' gds on direct modex in pmix4x
2019-03-19 20:35:12 -05:00
Ralph Castain
4e0905cda7
Merge pull request #6505 from rhc54/topic/pmxup
...
Sync to latest PMIx master and silence hwloc warnings
2019-03-19 12:53:15 -07:00
Ralph Castain
0f26d8c76b
Silence warnings
...
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-03-19 10:27:39 -07:00
Ralph Castain
c4be211741
Sync to latest PMIx master
...
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-03-19 10:27:12 -07:00
Joshua Hursey
1314cf2640
Do not force 'hash' gds on direct modex in pmix4x
...
* Forcing the 'hash' gds component should not be necessary any more.
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2019-03-19 11:53:26 -05:00
Josh Hursey
836c80c442
Merge pull request #6498 from jjhursey/rm-hash-pmix3
...
Do not force 'hash' gds on direct modex
2019-03-19 10:45:11 -05:00
Nathan Hjelm
bf5fb5b589
Merge pull request #6500 from nysal/spinlock_fix
...
opal/atomics: Add acquire semantics back for spinlocks
2019-03-19 07:54:37 -06:00
Jeff Squyres
5111dbd480
Merge pull request #6493 from rhc54/topic/order
...
Ensure that nodes are always used in order provided
2019-03-19 09:40:21 -04:00
Nysal Jan K.A
00f27a80fc
opal/atomics: Add acquire semantics back for spinlocks
...
This was introduced in commit 9d0b3fe9
Signed-off-by: Nysal Jan K.A <jnysal@in.ibm.com>
2019-03-19 16:27:03 +05:30
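To illustrate what "acquire semantics" means for the spinlock fix in 00f27a80fc above, here is a small sketch using C11 `<stdatomic.h>`. It is not the opal atomics code; `spinlock_t` and the `spin_*` functions are hypothetical names. The point is that taking the lock uses acquire ordering and releasing it uses release ordering, so accesses inside the critical section cannot be reordered outside of it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical spinlock: 0 means unlocked, 1 means locked. */
typedef struct { atomic_int value; } spinlock_t;

static void spin_init(spinlock_t *lock)
{
    assert(NULL != lock);
    atomic_init(&lock->value, 0);
}

/* Try to take the lock once. A successful exchange uses acquire
 * ordering so later loads and stores stay after the lock is taken. */
static bool spin_trylock(spinlock_t *lock)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &lock->value, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

/* Spin until the lock is taken; the inner loop only reads, which
 * avoids hammering the cache line with failed exchanges. */
static void spin_lock(spinlock_t *lock)
{
    while (!spin_trylock(lock)) {
        while (0 != atomic_load_explicit(&lock->value,
                                         memory_order_relaxed)) {
            /* busy wait */
        }
    }
}

/* Release ordering publishes the critical section's writes before
 * the lock becomes visible as free. */
static void spin_unlock(spinlock_t *lock)
{
    atomic_store_explicit(&lock->value, 0, memory_order_release);
}
```

Using relaxed ordering on the successful exchange instead would still provide mutual exclusion on the lock word itself, but it would not order the protected data accesses.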
Joshua Hursey
c2581d0e33
Do not force 'hash' gds on direct modex
...
* Forcing the 'hash' gds component should not be necessary any more.
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2019-03-18 21:52:32 -05:00
Ralph Castain
5aa775c02e
Correctly set the byte_object size
...
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-03-18 14:29:37 -07:00
Ralph Castain
aed06e68b9
Protect against NULL node pointer
...
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-03-16 01:31:28 -07:00
Ralph Castain
2794ae43b3
Update nidmap
...
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-03-16 01:20:15 -07:00
Ralph Castain
35a597178d
Ensure that nodes are always used in order provided
...
If a user provides a list of nodes to use via -host or -hostfile, then
ensure that the ranks are placed according to that order. Also fix a bug
where the number of slots on a node was incorrectly computed for
localhost if the name given didn't exactly match the return from
get_hostname.
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-03-15 12:58:10 -07:00
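The ordering rule from 35a597178d above can be sketched as follows. This is an illustration under assumptions, not the mapper code: `host_t` and `place_ranks()` are hypothetical, and the sketch assumes ranks fill each host's slots before moving to the next host. What it shows is that hosts are visited strictly in the order the user listed them, with no sorting by name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host entry as given via -host or -hostfile. */
typedef struct {
    const char *name;
    int slots;
} host_t;

/* Assign ranks to hosts in the order the hosts were listed.
 * rank_to_host[r] receives the index of the host that runs rank r.
 * Returns the number of ranks that could be placed. */
static int place_ranks(const host_t *hosts, size_t nhosts,
                       int nranks, int *rank_to_host)
{
    int rank = 0;
    size_t h;
    int s;

    assert(NULL != hosts && NULL != rank_to_host);
    for (h = 0; h < nhosts && rank < nranks; ++h) {
        for (s = 0; s < hosts[h].slots && rank < nranks; ++s) {
            rank_to_host[rank++] = (int) h;
        }
    }
    return rank;
}
```

For a host list n3 (2 slots), n1 (1 slot), n2 (2 slots) and four ranks, ranks 0 and 1 land on n3, rank 2 on n1 and rank 3 on n2.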
Xin Zhao
48033ac1f4
ompi/oshmem: add spml_context back to sshmem_type in memheap, to keep track of ucx_ctx_default's rkeys
...
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-15 18:48:21 +02:00
Xin Zhao
9a06000962
ompi/oshmem/spml/ucx: let shmem_finalize clean up any ctx left
...
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-15 18:48:07 +02:00
Xin Zhao
289595e45d
OMPI/OSHMEM: bug-fix: store mkeys for each oshmem ctx.
...
Signed-off-by: Xin Zhao <xinz@mellanox.com>
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-15 18:47:50 +02:00
Xin Zhao
79ba752667
ompi/oshmem/spml/ucx: fix eps destroy in shmem_ctx_destroy().
...
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-15 18:47:38 +02:00
Xin Zhao
b00209e1f5
Revert "OMPI/OSHMEM: bug-fix: store mkeys for each oshmem ctx."
...
This reverts commit f1b095c784de6d1908fa40dcf76e733110cbeaf2.
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-15 18:46:56 +02:00
Josh Hursey
ad8c842e7d
Merge pull request #6477 from markalle/report_bindings_strlen
...
opal_hwloc_base_cset2str() off-by-1 in its strncat()
2019-03-14 12:42:50 -05:00
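The merged patch for #6477 is not shown in this log, so the following only illustrates the general strncat() pitfall that the title refers to. The third argument of strncat() is the maximum number of characters to append, and a terminating NUL is always written after them, so the safe bound is the buffer size minus the current length minus one. `append_bounded()` is a hypothetical helper.

```c
#include <assert.h>
#include <string.h>

/* Append src to the NUL-terminated string in dst without writing past
 * dst_size bytes. The "- 1" reserves room for the terminating NUL that
 * strncat() always adds; omitting it is the classic off-by-one. */
static void append_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t used;

    assert(NULL != dst && NULL != src && dst_size > 0);
    used = strlen(dst);
    assert(used < dst_size);
    strncat(dst, src, dst_size - used - 1);
}
```

With an 8-byte buffer holding "abc", appending "defghij" yields "abcdefg": four characters fit, and the eighth byte holds the NUL.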
Yossi Itigin
9b91cf09cc
Merge pull request #6481 from hoopoepg/topic/check-ucx-params
...
PML/SPML/UCX: added evaluation of mmap events
2019-03-14 11:53:42 +02:00
Sergey Oblomov
c319cf9ade
COMMON/UCX: rewording of hooks suggestion
...
- also updated output macro
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-03-14 11:00:57 +02:00
bosilca
0173796008
Merge pull request #6482 from awlauria/indexed_datatype_overflows
...
Fix integer overflows with indexed datatype creation.
2019-03-13 11:46:43 -04:00
Austen Lauria
b61e6242d3
Fix integer overflows with indexed datatype creation.
...
The types of count, disp, and extent passed into
ompi_datatype_add() should be size_t, ptrdiff_t and ptrdiff_t,
respectively. This prevents integer overflows and errors in
computing the size of large indexed datatypes.
Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
2019-03-13 09:39:57 -04:00
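A short sketch of why the wider types in b61e6242d3 above matter, assuming an LP64 platform where `size_t` and `ptrdiff_t` are 64 bits wide. `block_span()` is a hypothetical helper, not part of ompi_datatype_add(); it computes count times extent in `ptrdiff_t` and checks that the product is representable.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bytes spanned by `count` elements of `extent` bytes each.
 * Doing this arithmetic in int would overflow once the product
 * exceeds INT_MAX; size_t and ptrdiff_t keep it exact on LP64. */
static ptrdiff_t block_span(size_t count, ptrdiff_t extent)
{
    assert(extent >= 0);
    /* Guard against overflow of the wider type as well. */
    assert(0 == extent ||
           count <= (size_t) PTRDIFF_MAX / (size_t) extent);
    return (ptrdiff_t) count * extent;
}
```

For example, 300000000 elements of 8 bytes span 2400000000 bytes, which is larger than INT_MAX on platforms with 32-bit int.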
Sergey Oblomov
d8e3562bae
PML/SPML/UCX: added evaluation of mmap events
...
- a set of UCX-related issues was reported that were caused by
conflicts between mmap API hooks. We added diagnostics for such
problems to simplify bug resolution
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-03-12 21:14:27 +02:00
Geoff Paulsen
a14bb4bc89
Merge pull request #6471 from hppritcha/topic/issue_6470
...
ompi_info: report whether MPI1 compat is enabled
2019-03-11 21:11:55 -05:00
Howard Pritchard
61ccc65302
ompi_info: report MPI1 compat is disabled
...
MPI1 compat disabled beyond v4.0.x
Related to #6470
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2019-03-11 13:50:29 -06:00
Jeff Squyres
2c76696e7a
Merge pull request #6476 from jsquyres/pr/remove-mpi1-compat-cli-options
...
configury: Remove --enable-mpi1-compatibility
2019-03-11 15:46:06 -04:00