Commit graph

24238 commits

Author SHA1 Message Date
Nathan Hjelm
b7ba301310 Merge pull request #1165 from hjelmn/add_procs_group
ompi/group: release ompi_proc_t's at group destruction
2015-12-14 13:53:42 -08:00
Nathan Hjelm
9d659465b7 Merge pull request #1210 from artpol84/icbarrier_fix
Fix NBC iBarrier for inter-communicators.
2015-12-14 13:52:38 -08:00
Nathan Hjelm
4b3dac5933 Merge pull request #1216 from artpol84/icgatherv_fix
Fix NBC iGatherv for inter-communicators.
2015-12-14 13:51:58 -08:00
Jeff Squyres
357ca4ffd2 travis: add config file for travis-ci.org 2015-12-14 13:05:59 -08:00
Matias Cabral
7cfd7d50b9 Merge pull request #1219 from matcabral/PSM2_tag_hashing
Support for PSM2 hashing lookup in message queue.
2015-12-14 12:01:55 -08:00
matcabral
9a1f9be146 A new internal feature in PSM2 will use hash tables to
accelerate message queue lookups if the lookups have
the proper tag&mask layout. OpenMPI should follow
PSM2's preferred tag&mask spec, so that PSM2 can provide
a performance benefit.
2015-12-14 10:13:39 -08:00
Todd Kordenbrock
7b97963669 btl-portals4: remove unnecessary PtlMDBind result check
When PtlMDBind was removed, the result check was left in, which
causes intermittent failures depending on the junk value found in
the 'ret' variable.  This commit removes the result check.
2015-12-14 12:09:01 -06:00
Jeff Squyres
3e308f41f7 rmaps base help: update binding error messages
Due to user confusion, update the show-help messages displayed when
processor and/or memory binding fails.  Thanks to Dave Love
(@loveshack) for the initial suggestion.

Fixes open-mpi/ompi#1087
2015-12-14 13:02:41 -05:00
Igor Ivanov
36d3a7aa6c contrib: Add bash script to measure performance
This script is useful for measuring the time from launching an ompi
application to different internal points. A user can easily add
their own tests based on the existing ones.
See readme information inside the script.
2015-12-14 17:42:19 +02:00
Artem Polyakov
2d0919dbdc Fix NBC iGatherv for inter-communicators.
We need to use remote size to form a schedule.
2015-12-14 12:19:10 +06:00
Jeff Squyres
7977fa3f0b pmix112 config.h.in: remove generated file 2015-12-13 06:46:55 -08:00
Jeff Squyres
65f5a26f76 monitoring_test.c: remove unused var
Silence compiler warning
2015-12-13 06:46:11 -08:00
rhc54
6b23c917e5 Merge pull request #1212 from rhc54/pmix112
Update the PMIx native component to release v1.1.2
2015-12-12 21:43:32 -08:00
Ralph Castain
03eb1a80bf Update the PMIx native component to release v1.1.1, with addition of one bug-fix commit beyond the official release
Rename the pmix1xx component to pmix111 so it reflects the actual release it includes

Resolve the problem of PMIx being passed a bogus --with-platform argument when configuring the PMIx tarball code. There is no reason we should be passing --with-platform arguments to any internal subdirectory, so just leave that out when constructing the opal_subdir_args variable.

Update the PMIx code and continue attempting to debug direct modex

Fix a problem in the ORTE PMIx server - there was an early intent to optimize the direct modex by fetching data for all procs from the target job on the remote node, instead of fetching the data one proc at a time. However, this was never completely implemented, and so we would hang if we had multiple overlapping requests for data from more than one proc on the node.

Update PMIx to v1.1.2
2015-12-12 18:46:38 -08:00
rhc54
de7b93d3fc Merge pull request #1211 from rhc54/topic/jsharpe
Port the changes from #782 to the master.
2015-12-12 13:32:29 -08:00
Ralph Castain
5e5adebf8e Port the changes from #782 to the master. Not everything applies here as the code in the 1.10 series is a little different. In addition, we asked for a few changes (e.g., using MPI_ERR_ARG instead of "13") that are incorporated here.
Thanks to @jsharpe for the PR
2015-12-12 12:40:34 -08:00
Artem Polyakov
fc17deca43 Fix NBC iBarrier for inter-communicators.
Remove send of the extra message. This bug was triggered by the
MPICH/coll/nbicbarrier test. In this test a series of communicators
are created.
This extra message was received after the original communicator was destroyed
and queued into non_existing_communicator_pending. When a new, completely
unrelated communicator with the same id as the original was created, this message
was pushed into the frags_cant_match queue and caused a sequence-number skew and a hang.
2015-12-12 13:27:31 +06:00
Gilles Gouaillardet
3a3b13ea12 coll/base: fix an integer overflow in ompi_coll_base_reduce_generic
Refs open-mpi/ompi#1198
2015-12-11 13:55:59 +09:00
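
The commit above does not show the code it changes. As a rough illustration only, a byte-size computation of the form count * extent can overflow when done in int, and widening one operand first avoids that. The helper names below are hypothetical and are not taken from ompi_coll_base_reduce_generic.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper, not Open MPI code: size in bytes of a segment of
 * `count` elements, each `extent` bytes wide. Casting `count` to ptrdiff_t
 * before multiplying means the product is computed in the wider type. */
static ptrdiff_t segment_bytes(int count, ptrdiff_t extent)
{
    assert(count >= 0 && extent >= 0);
    return (ptrdiff_t)count * extent;
}

/* Hypothetical guard: returns 1 if count * extent would not fit in an int,
 * 0 otherwise. It divides instead of multiplying, so it cannot overflow. */
static int int_product_would_overflow(int count, int extent)
{
    assert(count >= 0 && extent > 0);
    return count > INT_MAX / extent;
}
```

With a 32-bit int, a segment of 2^29 elements of 8 bytes each already exceeds INT_MAX, so the guard reports an overflow while the widened helper still returns the exact size on platforms with a 64-bit ptrdiff_t.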
Artem Polyakov
25077fc5d9 Fix MPI_Alltoall to support inter-communicators.
Remove excessive parameter check to avoid premature exit from the collective.

MPI standard says:
The type signature associated with sendcount, sendtype, at a process must be equal to
the type signature associated with recvcount, recvtype at any other process. This implies
that the amount of data sent must be equal to the amount of data received, pairwise between
every pair of processes.

In the case of an inter-communicator we have 2 groups of processes, and the "left" group may call
MPI_Alltoall(NULL, 0, MPI_INT, buf, 10, MPI_INT, comm, ...);
and the right one:
MPI_Alltoall(buf,10,MPI_INT, NULL, 0, MPI_INT, comm, ...);

And it would be legal even though one of the groups will receive 0 bytes from the others.

This was triggered by MPICH/coll test called icalltoall.
2015-12-11 08:50:34 +06:00
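
The pairwise rule quoted in the commit above can be sketched without MPI. This is a minimal model under stated assumptions: the struct and function names are hypothetical, and sizes are plain byte counts rather than MPI type signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one process's MPI_Alltoall arguments, reduced to
 * the number of bytes it sends to and receives from each remote process. */
struct alltoall_args {
    size_t send_bytes;
    size_t recv_bytes;
};

/* Bytes described by a (count, element size) pair. */
static size_t payload_bytes(int count, size_t elem_size)
{
    assert(count >= 0);
    return (size_t)count * elem_size;
}

/* What one side sends must equal what the other side receives, and
 * vice versa. Returns 1 if the pair is consistent, 0 otherwise. */
static int pairwise_sizes_match(struct alltoall_args local,
                                struct alltoall_args remote)
{
    return local.send_bytes == remote.recv_bytes &&
           local.recv_bytes == remote.send_bytes;
}

/* A process may skip the exchange only if it neither sends nor receives.
 * A zero send size alone is not enough: it may still have data to receive. */
static int must_participate(struct alltoall_args local)
{
    return local.send_bytes != 0 || local.recv_bytes != 0;
}
```

In this model the "left" group from the commit message sends 0 bytes and receives 10 ints, the "right" group does the opposite, the pair is consistent, and both sides still have to take part in the collective.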
Jeff Squyres
e80f5681f3 orte_setup_java.m4: remove unused file 2015-12-10 11:59:57 -08:00
Mike Dubman
98bc8b08d6 Merge pull request #1203 from alinask/topic/ucx_fix_typo
PML UCX: fix typo (following 7becc54d).
2015-12-10 15:04:27 +02:00
Alina Sklarevich
3ffd8dcd20 PML UCX: fix typo (following 7becc54d). 2015-12-10 13:51:10 +02:00
rhc54
513092bc25 Merge pull request #1201 from rhc54/topic/essdflt
Don't be so prescriptive about the ess component to be used
2015-12-09 23:12:42 -08:00
Nathan Hjelm
219a7cde40 Merge pull request #1200 from hjelmn/var_group_fix
mca/base: remove erroneous check in var group register function
2015-12-09 22:18:56 -08:00
Ralph Castain
1db3db022a Don't be so prescriptive about the ess component to be used - we just need to protect against the proc incorrectly taking the singleton component, so rule that one out. Ensure that the other components understand that they are only for use by daemons. 2015-12-09 19:54:44 -08:00
Nathan Hjelm
772172a99b mca/base: remove erroneous check in var group register function
This commit removes a check that causes mca_base_group_register to
improperly create a new group instead of using an existing group
when the project and framework names are the same. This check was
originally intended to prevent forming groups with names like
ompi_ompi, opal_opal, etc but there is no reason why we shouldn't
allow that.

Fixes open-mpi/ompi#1155

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-12-09 19:48:39 -07:00
Nathan Hjelm
bf5b2bb74f Merge pull request #1199 from hjelmn/mlx5_attr_check
btl/openib: add check for IBV_EXP_QP_INIT_ATTR_ATOMICS_ARG
2015-12-09 18:35:42 -08:00
Nathan Hjelm
f692576f1e btl/openib: add check for IBV_EXP_QP_INIT_ATTR_ATOMICS_ARG
Mofed 2.2 does not have the IBV_EXP_QP_INIT_ATTR_ATOMICS_ARG attribute
flag. Add a check to fix compilation for mofed 2.2. This commit only
fixes compilation with the older mofed. It will not allow an Open MPI
compiled with mofed 2.3 or newer to work on a machine with mofed 2.2.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-12-09 17:02:36 -07:00
Ryan Grant
e5ea2e3248 Merge pull request #1193 from tkordenbrock/topic/fix.btl.logical.endpoint.rank
btl-portals4: set endpoint rank even if endpoint already exists

--Needs to be pulled over to 2.0.0 still @tkordenbrock
2015-12-09 13:49:44 -08:00
Nathan Hjelm
645d4099d4 Merge pull request #1197 from artpol84/comm_create_fix
Fix ompi_comm_create when source communicator is inter-communicator.
2015-12-09 09:18:24 -08:00
Artem Polyakov
ee71e35a90 Fix ompi_comm_create when source communicator is inter-communicator.
This bug was triggered by probe-intercom and icm tests from MPICH suite.
2015-12-09 15:44:26 +02:00
Gilles Gouaillardet
3a62341b30 Merge pull request #1189 from ggouaillardet/topic/empty_ddt_fix
ddt: duplicate MPI_DATATYPE_NULL when ompi_datatype_create_indexed or…
2015-12-09 15:29:03 +09:00
Jeff Squyres
00c5dc9449 rml oob: C99-ification of structure member assignment 2015-12-08 17:05:16 -08:00
Howard Pritchard
c2ea018ce5 Merge pull request #1194 from hppritcha/topic/fix_cray_pmix_locality
pmix/cray: fix locality bug
2015-12-08 16:25:30 -07:00
Howard Pritchard
cb7c26ce96 plm/slurm: add support for cray native slurm
Cray has added plugins to slurm to support
the Cray programming env (alpslli, cray pmi, etc).
Some of the workarounds needed with plm/alps
to avoid issues with Cray PMI getting mixed up
with orte launch system are also required in
a cray native slurm environment.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-12-08 13:47:20 -06:00
Howard Pritchard
fecb326256 pmix/cray: fix locality bug
There was a bug with the way the cray pmix component
was setting the locality property for ranks on the
same node, etc.

Improve location/syntax of a comment block.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-12-08 11:13:48 -08:00
Todd Kordenbrock
2b7e983989 btl-portals4: set endpoint rank even if endpoint already exists
If btl-portals4 is configured to use logical mapping of ranks to
physical nodes, then the endpoint must have the rank field set.
This commit fixes a bug that caused the endpoint to have the
nid/pid instead of the rank if the endpoint already exists.
2015-12-08 12:29:00 -06:00
Nathan Hjelm
2ff16c6ba5 Merge pull request #1192 from hjelmn/mlx5_atomics_fix
mlx5: need to set comp_mask to get experimental verbs attributes
2015-12-08 11:20:37 -07:00
Nathan Hjelm
f317ba5262 Merge pull request #1163 from hjelmn/ompi_proc_threads
ompi/proc: make proc system always thread safe
2015-12-08 10:36:55 -07:00
Nathan Hjelm
c9382f23e9 mlx5: need to set comp_mask to get experimental verbs attributes
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-12-08 10:34:16 -07:00
Nathan Hjelm
b47a64f27d Merge pull request #1188 from artpol84/intercomm_split_fix
Yet one more fix to intercommunicator splitting logic.
2015-12-08 07:09:46 -07:00
Nathan Hjelm
dae3746d2f Merge pull request #1190 from kawashima-fj/pr/sm-win-test-fix
osc/sm: Fix a bug that `MPI_WIN_TEST` does not update `flag` to 0
2015-12-08 06:39:16 -07:00
KAWASHIMA Takahiro
9c7b6a4352 osc/sm: Fix a bug that MPI_WIN_TEST does not update flag to 0.
`MPI_WIN_TEST` must update the `flag` parameter to 0 when not all
origin processes have called `MPI_WIN_COMPLETE`. But the sm OSC doesn't.
If the caller initializes the `flag` argument to a non-0 value,
the caller will receive that non-0 `flag` value.
2015-12-08 19:23:21 +09:00
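
The behavior described in the commit above comes down to always writing the output argument. The sketch below is not the osc/sm code: the function name and the two counters are hypothetical stand-ins for the component's internal state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a window test: `completed` is how many origin
 * processes have finished their access epoch, `expected` is how many must.
 * The flag is written on every call, so a caller that initialized it to a
 * non-zero value never sees that stale value. */
static int win_test_sketch(int completed, int expected, int *flag)
{
    assert(flag != NULL);
    *flag = (completed == expected) ? 1 : 0;
    return 0;
}
```

A version that only assigns 1 on completion and otherwise leaves `flag` untouched would show the reported symptom: a caller that starts with a non-zero flag would read it as "complete".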
Gilles Gouaillardet
59a361b781 ompio: correctly handle zero f_cc_size in mca_io_ompio_simple_grouping 2015-12-08 17:00:11 +09:00
Gilles Gouaillardet
d43ad3fada ddt: duplicate MPI_DATATYPE_NULL when ompi_datatype_create_indexed or ompi_datatype_create_indexed_block is invoked with a zero count 2015-12-08 16:25:36 +09:00
Artem Polyakov
7690f4027a Yet one more fix to intercommunicator splitting logic.
Previous commit f2794740 reverts Nathan's changes. However, it turns out
that I was unable to trace his logic until I started investigating the
icsplit hang. The bug was triggered when splitting an intercommunicator produced
a group where one side of the communicator was empty (icsplit, intercom create #2).
In this case remote_size == 0 and there is no way to distinguish between
inter- and intra-communicators.
Conclusion: We do need to distinguish between intra- and inter-communicators.
So we should use ompi_mpi_group_null.group.
2015-12-08 08:43:08 +02:00
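
One reading of the conclusion above is that a size check cannot tell "no remote group" from "empty remote group", while a dedicated sentinel object can. The sketch below illustrates that reading only. All type and variable names are hypothetical, and the sentinel merely plays the role that the commit assigns to ompi_mpi_group_null.group.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified group and communicator records. */
struct group_sketch {
    int size;
};

struct comm_sketch {
    struct group_sketch *local;
    struct group_sketch *remote;
};

/* Sentinel meaning "this communicator has no remote group at all". */
static struct group_sketch group_null_sketch = { 0 };

/* Ambiguous test: an inter-communicator whose remote side is empty
 * looks exactly like an intra-communicator. */
static int is_inter_by_size(const struct comm_sketch *c)
{
    return c->remote->size > 0;
}

/* Unambiguous test: compare against the sentinel's identity, not its size. */
static int is_inter_by_sentinel(const struct comm_sketch *c)
{
    assert(c->remote != NULL);
    return c->remote != &group_null_sketch;
}
```

An inter-communicator with an empty remote group is classified correctly only by the sentinel comparison.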
Nathan Hjelm
63d8feb31c Merge pull request #1187 from hjelmn/bsend_fix
pml/ob1: add missing ompi_request_wait_completion for buffered sends
2015-12-07 23:09:04 -07:00
Nathan Hjelm
f68c315188 pml/ob1: add missing ompi_request_wait_completion for buffered sends
This commit adds a call to ompi_request_wait_completion for buffered
sends. Without this line it is possible to get into a state where the
data is never sent.

Fixes open-mpi/ompi#1185

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-12-07 22:28:07 -07:00
Artem Polyakov
90b4148f9b Merge pull request #1184 from artpol84/intercomm_split_fix
Fix intercommunicator split (was triggered by MPICH/icsend test)
2015-12-08 08:46:38 +05:00
Nathan Hjelm
eb830b9501 ompi_proc_pack: correctly handle proc sentinels
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-12-07 17:27:38 -07:00