Commit graph

24272 commits

Author SHA1 Message Date
Artem Polyakov
fc17deca43 Fix NBC iBarrier for inter-communicators.
Remove send of the extra message. This bug was triggered by the
MPICH/coll/nbicbarrier test. In this test a series of communicators
are created.
The extra message was received after the original communicator was destroyed
and queued into non_existing_communicator_pending. When a new, completely
unrelated communicator with the same id as the original was created, this message
was pushed into the frags_cant_match queue and caused a seq number skew and a hang.
2015-12-12 13:27:31 +06:00
Gilles Gouaillardet
3a3b13ea12 coll/base: fix an integer overflow in ompi_coll_base_reduce_generic
Refs open-mpi/ompi#1198
2015-12-11 13:55:59 +09:00
Artem Polyakov
25077fc5d9 Fix MPI_Alltoall to support inter-communicators.
Remove excessive parameter check to avoid premature exit from the collective.

MPI standard says:
The type signature associated with sendcount, sendtype, at a process must be equal to
the type signature associated with recvcount, recvtype at any other process. This implies
that the amount of data sent must be equal to the amount of data received, pairwise between
every pair of processes.

In the case of an inter-communicator we have two groups of processes, and the "left" group may call
MPI_Alltoall(NULL, 0, MPI_INT, buf, 10, MPI_INT, comm, ...);
and the right one:
MPI_Alltoall(buf, 10, MPI_INT, NULL, 0, MPI_INT, comm, ...);

This would be legal even though one of the groups will receive 0 bytes from the others.

This was triggered by the MPICH/coll test called icalltoall.
2015-12-11 08:50:34 +06:00
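
The pairwise rule quoted in 25077fc5d9 can be sketched in plain C without an MPI library. This is only an illustration of the rule, not the Open MPI parameter check: the struct and function names are hypothetical, and a datatype is reduced to its element size in bytes, so the check covers the "amount of data" consequence rather than full type-signature equality.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the arguments one group passes to MPI_Alltoall.
 * Not an Open MPI type; a datatype is reduced to its size in bytes. */
struct alltoall_args {
    size_t sendcount;  /* elements sent to each remote process */
    size_t sendsize;   /* bytes per send element */
    size_t recvcount;  /* elements received from each remote process */
    size_t recvsize;   /* bytes per receive element */
};

/* On an inter-communicator the bytes one group sends must equal the bytes
 * the other group receives, in both directions. A group's own sendcount
 * and recvcount do not have to match each other. */
static bool intercomm_alltoall_consistent(const struct alltoall_args *left,
                                          const struct alltoall_args *right)
{
    assert(left != NULL && right != NULL);
    return left->sendcount * left->sendsize == right->recvcount * right->recvsize
        && right->sendcount * right->sendsize == left->recvcount * left->recvsize;
}
```

With the two calls from the commit message, the left group is {0, sizeof(int), 10, sizeof(int)} and the right group is {10, sizeof(int), 0, sizeof(int)}. The check passes even though the left group sends nothing, which is why a check that compares a group's own send and receive counts is too strict here.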
Jeff Squyres
e80f5681f3 orte_setup_java.m4: remove unused file 2015-12-10 11:59:57 -08:00
Mike Dubman
98bc8b08d6 Merge pull request #1203 from alinask/topic/ucx_fix_typo
PML UCX: fix typo (following 7becc54d).
2015-12-10 15:04:27 +02:00
Alina Sklarevich
3ffd8dcd20 PML UCX: fix typo (following 7becc54d). 2015-12-10 13:51:10 +02:00
rhc54
513092bc25 Merge pull request #1201 from rhc54/topic/essdflt
Don't be so prescriptive about the ess component to be used
2015-12-09 23:12:42 -08:00
Nathan Hjelm
219a7cde40 Merge pull request #1200 from hjelmn/var_group_fix
mca/base: remove erroneous check in var group register function
2015-12-09 22:18:56 -08:00
Ralph Castain
1db3db022a Don't be so prescriptive about the ess component to be used - we just need to protect against the proc incorrectly taking the singleton component, so rule that one out. Ensure that the other components understand that they are only for use by daemons. 2015-12-09 19:54:44 -08:00
Nathan Hjelm
772172a99b mca/base: remove erroneous check in var group register function
This commit removes a check that causes mca_base_group_register to
improperly create a new group instead of using an existing group
when the project and framework names are the same. This check was
originally intended to prevent forming groups with names like
ompi_ompi, opal_opal, etc but there is no reason why we shouldn't
allow that.

Fixes open-mpi/ompi#1155

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-12-09 19:48:39 -07:00
Nathan Hjelm
bf5b2bb74f Merge pull request #1199 from hjelmn/mlx5_attr_check
btl/openib: add check for IBV_EXP_QP_INIT_ATTR_ATOMICS_ARG
2015-12-09 18:35:42 -08:00
Nathan Hjelm
f692576f1e btl/openib: add check for IBV_EXP_QP_INIT_ATTR_ATOMICS_ARG
Mofed 2.2 does not have the IBV_EXP_QP_INIT_ATTR_ATOMICS_ARG attribute
flag. Add a check to fix compilation for mofed 2.2. This commit only
fixes compilation with the older mofed. It will not allow an Open MPI
compiled with mofed 2.3 or newer to work on a machine with mofed 2.2.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-12-09 17:02:36 -07:00
Ryan Grant
e5ea2e3248 Merge pull request #1193 from tkordenbrock/topic/fix.btl.logical.endpoint.rank
btl-portals4: set endpoint rank even if endpoint already exists

--Needs to be pulled over to 2.0.0 still @tkordenbrock
2015-12-09 13:49:44 -08:00
Nathan Hjelm
645d4099d4 Merge pull request #1197 from artpol84/comm_create_fix
Fix ompi_comm_create when source communicator is inter-communicator.
2015-12-09 09:18:24 -08:00
Artem Polyakov
ee71e35a90 Fix ompi_comm_create when source communicator is inter-communicator.
This bug was triggered by probe-intercom and icm tests from MPICH suite.
2015-12-09 15:44:26 +02:00
Gilles Gouaillardet
3a62341b30 Merge pull request #1189 from ggouaillardet/topic/empty_ddt_fix
ddt: duplicate MPI_DATATYPE_NULL when ompi_datatype_create_indexed of…
2015-12-09 15:29:03 +09:00
Jeff Squyres
00c5dc9449 rml oob: C99-ification of structure member assignment 2015-12-08 17:05:16 -08:00
Howard Pritchard
c2ea018ce5 Merge pull request #1194 from hppritcha/topic/fix_cray_pmix_locality
pmix/cray: fix locality bug
2015-12-08 16:25:30 -07:00
Howard Pritchard
cb7c26ce96 plm/slurm: add support for cray native slurm
Cray has added plugins to slurm to support
the Cray programming env (alpslli, cray pmi, etc).
Some of the workarounds needed with plm/alps
to avoid issues with Cray PMI getting mixed up
with orte launch system are also required in
a cray native slurm environment.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-12-08 13:47:20 -06:00
Howard Pritchard
fecb326256 pmix/cray: fix locality bug
There was a bug with the way the cray pmix component
was setting the locality property for ranks on the
same node, etc.

Improve location/syntax of a comment block.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-12-08 11:13:48 -08:00
Todd Kordenbrock
2b7e983989 btl-portals4: set endpoint rank even if endpoint already exists
If btl-portals4 is configured to use logical mapping of ranks to
physical nodes, then the endpoint must have the rank field set.
This commit fixes a bug that caused the endpoint to have the
nid/pid instead of the rank if the endpoint already exists.
2015-12-08 12:29:00 -06:00
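
The fix described in 2b7e983989 can be sketched as follows, assuming the process identifier is stored in a union that holds either a physical nid/pid pair or a logical rank. All types and names below are hypothetical stand-ins, not the btl-portals4 structures; the point is only that the identifier is written whether or not the endpoint already exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical process identifier: either physical (nid/pid) or logical (rank). */
union process_id {
    struct { int nid; int pid; } phys;
    int rank;
};

/* Hypothetical endpoint record; not the btl-portals4 endpoint type. */
struct endpoint {
    bool exists;
    union process_id id;
};

/* Set up an endpoint. The identifier is assigned outside the "create"
 * branch, so an endpoint that already exists also ends up holding the
 * rank when logical mapping is in use. */
static void endpoint_setup(struct endpoint *ep, bool logical_mapping,
                           int nid, int pid, int rank)
{
    assert(ep != NULL);
    if (!ep->exists) {
        ep->exists = true;  /* one-time creation work would go here */
    }
    if (logical_mapping) {
        ep->id.rank = rank;
    } else {
        ep->id.phys.nid = nid;
        ep->id.phys.pid = pid;
    }
}
```

Calling endpoint_setup on an endpoint that was earlier created with a physical nid/pid, this time with logical_mapping set to true, leaves id.rank holding the rank rather than the stale nid.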
Nathan Hjelm
2ff16c6ba5 Merge pull request #1192 from hjelmn/mlx5_atomics_fix
mlx5: need to set comp_mask to get experimental verbs attributes
2015-12-08 11:20:37 -07:00
Nathan Hjelm
f317ba5262 Merge pull request #1163 from hjelmn/ompi_proc_threads
ompi/proc: make proc system always thread safe
2015-12-08 10:36:55 -07:00
Nathan Hjelm
c9382f23e9 mlx5: need to set comp_mask to get experimental verbs attributes
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-12-08 10:34:16 -07:00
Nathan Hjelm
b47a64f27d Merge pull request #1188 from artpol84/intercomm_split_fix
Yet one more fix to intercommunicator splitting logic.
2015-12-08 07:09:46 -07:00
Nathan Hjelm
dae3746d2f Merge pull request #1190 from kawashima-fj/pr/sm-win-test-fix
osc/sm: Fix a bug that `MPI_WIN_TEST` does not update `flag` to 0
2015-12-08 06:39:16 -07:00
KAWASHIMA Takahiro
9c7b6a4352 osc/sm: Fix a bug that MPI_WIN_TEST does not update flag to 0.
`MPI_WIN_TEST` must update the `flag` parameter to 0 when not all
origin processes have called `MPI_WIN_COMPLETE`, but the sm OSC does not.
If the caller initializes the `flag` argument to a non-0 value,
the caller will receive the non-0 `flag` value.
2015-12-08 19:23:21 +09:00
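
The behaviour required in 9c7b6a4352 can be sketched in plain C. This is not the osc/sm code: the window state struct and the function name are hypothetical, and the only point shown is that the output flag is written on every call, so a stale value supplied by the caller cannot survive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the window state; not the osc/sm structure. */
struct win_state {
    int expected_completes;  /* origin processes in the exposure epoch */
    int received_completes;  /* origins that have called MPI_WIN_COMPLETE */
};

/* Write *flag unconditionally: 1 if every origin has completed, 0 otherwise.
 * The bug described in the commit corresponds to writing *flag only in the
 * "all completed" case. */
static void win_test_sketch(const struct win_state *win, int *flag)
{
    assert(win != NULL && flag != NULL);
    *flag = (win->received_completes >= win->expected_completes) ? 1 : 0;
}
```

If a caller sets flag to a non-zero value before the call and only one of three origins has completed, flag is 0 afterwards.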
Gilles Gouaillardet
59a361b781 ompio: correctly handle zero f_cc_size in mca_io_ompio_simple_grouping 2015-12-08 17:00:11 +09:00
Gilles Gouaillardet
d43ad3fada ddt: duplicate MPI_DATATYPE_NULL when ompi_datatype_create_indexed or ompi_datatype_create_indexed_block is invoked with a zero count 2015-12-08 16:25:36 +09:00
Artem Polyakov
7690f4027a Yet one more fix to intercommunicator splitting logic.
The previous commit f2794740 reverted Nathan's changes. However, it turns out
that I was unable to follow his logic until I started investigating the
icsplit hang. The bug was triggered when splitting an intercommunicator produced a group
where one side of the communicator was empty (icsplit, intercom create #2).
In this case remote_size == 0 and there is no way to distinguish between
inter- and intra-communicators.
Conclusion: we do need to distinguish between intra- and inter-communicators,
so we should use ompi_mpi_group_null.group.
2015-12-08 08:43:08 +02:00
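
One plausible reading of the conclusion in 7690f4027a is sketched below: an intra-communicator is marked by a sentinel remote group, so an inter-communicator whose remote side is empty can still be told apart. The types are hypothetical, and group_null merely plays the role that ompi_mpi_group_null.group has in the commit message; this is not the Open MPI code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical group type: only the size matters for this sketch. */
struct group {
    int size;
};

/* Sentinel standing in for ompi_mpi_group_null.group (assumption). */
static struct group group_null = { 0 };

/* Hypothetical communicator: intra-communicators point at the sentinel. */
struct comm {
    struct group *local_group;
    struct group *remote_group;
};

/* Compare against the sentinel's address, not the remote size: an empty
 * remote group has size 0, exactly like "no remote group at all". */
static bool comm_is_inter(const struct comm *c)
{
    assert(c != NULL);
    return c->remote_group != &group_null;
}
```

A size-based test such as remote_group->size > 0 would classify an inter-communicator with an empty remote side as an intra-communicator, which is the ambiguity the commit message describes.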
Nathan Hjelm
63d8feb31c Merge pull request #1187 from hjelmn/bsend_fix
pml/ob1: add missing ompi_request_wait_completion for buffered sends
2015-12-07 23:09:04 -07:00
Nathan Hjelm
f68c315188 pml/ob1: add missing ompi_request_wait_completion for buffered sends
This commit adds a call to ompi_request_wait_completion for buffered
sends. Without this line it is possible to get into a state where the
data is never sent.

Fixes open-mpi/ompi#1185

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-12-07 22:28:07 -07:00
Artem Polyakov
90b4148f9b Merge pull request #1184 from artpol84/intercomm_split_fix
Fix intercommunicator split (was triggered by MPICH/icsend test)
2015-12-08 08:46:38 +05:00
Nathan Hjelm
eb830b9501 ompi_proc_pack: correctly handle proc sentinels
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-12-07 17:27:38 -07:00
Howard Pritchard
9548b8a9e8 plm/alps: add wlm detect infrastructure
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-12-07 07:43:20 -08:00
Artem Polyakov
f2794740b3 Fix intercommunicator split (was triggered by MPICH/icsend test) 2015-12-07 15:41:29 +02:00
rhc54
04fc4060db Merge pull request #1143 from rhc54/topic/noompi
Fix "autogen.pl --no-ompi"
2015-12-07 00:40:16 -08:00
Gilles Gouaillardet
4d2c7f7de1 cuda: fix missing #include opal/util/argv.h 2015-12-07 14:10:32 +09:00
Ralph Castain
e33515c87c Fix "autogen.pl --no-ompi", which was broken due to inclusion of a conditional in the test/monitoring Makefile.am that is only defined if OMPI is built.
Per suggestion from @bosilca, comment out generation of the shared library

Use the patch from Gilles instead
2015-12-06 21:10:17 -08:00
Gilles Gouaillardet
bfe8e03d9d fcoll/two_phase: use ompi_mpi_abort instead of PMPI_Abort
Thanks Jeff for the review
2015-12-07 11:34:36 +09:00
Gilles Gouaillardet
ef03bc726c ompi: fix comment in ompi/mpi/c/Makefile.am
Thanks Jeff for the review
2015-12-07 11:34:01 +09:00
Gilles Gouaillardet
37c978f5e9 coll/libnbc: correctly handle changed types.
this fixes open-mpi/ompi@d816d1c194
thanks Jeff for the review
2015-12-07 10:13:43 +09:00
Gilles Gouaillardet
26b2ed1069 fortran: add missing MPI_xxx_DUP_FN bindings in use-mpi-tkr
- MPI_COMM_DUP_FN
- MPI_TYPE_DUP_FN
- MPI_WIN_DUP_FN
2015-12-07 09:10:48 +09:00
Howard Pritchard
d7b437ecaa Merge pull request #1157 from gpichot/adds-darwin-headers-for-java
Adds darwin headers directory for Darwin JDK
2015-12-06 14:49:53 -07:00
George Bosilca
3a9664ac9d Fix Coverity CIDs 1341584-1341589. 2015-12-06 14:06:36 -05:00
Gabriel Pichot
ff5af73676 Adds usage of /usr/libexec/java_home for OS X platforms 2015-12-05 11:51:51 +01:00
Ralph Castain
10db7ebfab Update the symbol-hiding script to capture a broader range of symbols 2015-12-04 21:05:57 -08:00
rhc54
1568d6b33c Merge pull request #1181 from rhc54/topic/tools
Provide a mechanism by which a tool can request async progress thread support for ORTE
2015-12-04 12:24:05 -08:00
Ralph Castain
8823069fe9 Provide a mechanism by which a tool can request async progress thread support for ORTE 2015-12-04 08:26:57 -08:00
Jeff Squyres
ad35a363fa Merge pull request #1179 from jsquyres/pr/mpi-testsome-man-page-update
Pr/mpi testsome man page update
2015-12-04 05:55:33 -05:00