
8612 Commits

Author SHA1 Message Date
George Bosilca
80343a0d39 add ability to query pml monitoring results with MPI Tools interface
using performance variables "pml_monitoring_messages_count" and
"pml_monitoring_messages_size"

Per Brice's suggestion, make all data counts and message lengths
uint64_t.
2015-10-31 17:13:35 -04:00
George Bosilca
a47d69202f Add a monitoring PML. This PML tracks all data exchanges by the processes,
counting or not the collective traffic as a separate entity. The need
for such a PML is simply because the PMPI interface doesn't allow us to
identify the collective generated traffic.
2015-10-31 17:13:35 -04:00
Rolf vandeVaart
578385ca78 Merge pull request #1079 from rolfv/pr/cuda-require-41
Make CUDA 4.1 a requirement for CUDA-aware support
2015-10-29 12:56:22 -04:00
Nathan Hjelm
b1e3936261 Merge pull request #1078 from rolfv/pr/disable-osc-rdma-for-cuda
Disable the use of osc rdma when we detect a GPU buffer
2015-10-29 10:03:28 -06:00
Rolf vandeVaart
f2ff6e03ab Make CUDA 4.1 a requirement for CUDA-aware support.
Remove all related preprocessor conditionals.
2015-10-29 11:24:02 -04:00
Matias Cabral
8ebcac1b2c Merge pull request #1075 from matcabral/psm2_symbol_rename
Updated psm2 mtl with new externally exposed symbols of psm2.so 
Fixes open-mpi/ompi#1018
Fixes open-mpi/ompi#1021
2015-10-28 13:55:45 -07:00
Rolf vandeVaart
87a4cc6118 Disable the use of osc rdma when we detect a GPU buffer as it is not supported in that component.
This forces a failover to the osc pt2pt component. Fixes #1012
2015-10-28 14:47:45 -04:00
yosefe
ae738d0434 pml_ucx: add pmi fence in del_procs 2015-10-28 18:34:36 +02:00
Matias A Cabral
ed16d8e1cc Updated psm2 mtl with new externally exposed symbols of psm2.so
Fixes open-mpi/ompi#1018
Fixes open-mpi/ompi#1021
2015-10-28 09:12:33 -07:00
yosefe
41b6230be3 pml_ucx: fix debug macros, and initialize mpi request properly. 2015-10-28 10:59:25 +02:00
Ralph Castain
267ca8fcd3 Clean up the PMIx direct modex support. Add an MCA parameter pmix_base_async_modex that will cause the async modex to be used when set to 1. Default it to 0 for now to continue current default behavior.

Also add an MCA param pmix_base_collect_data to direct that the blocking fence shall return all data to each process. Obviously, this param has no effect if async_modex is used.
2015-10-27 17:31:56 -07:00
yohann
8bf1c95cdc mtl/ofi: Remove unused help messages. 2015-10-27 09:38:04 -07:00
Nathan Hjelm
69d403d42b Merge pull request #1054 from hjelmn/add_procs_threading
add_procs: add threading protection for dynamic add_procs
2015-10-27 09:27:13 -06:00
yohann
a111d66f0f mtl/ofi: Change hints to FI_PROGRESS_MANUAL. 2015-10-26 15:32:30 -07:00
yohann
fde8b89ceb mtl/ofi: Use OFI's representation of ANY_SRC instead of NULL. 2015-10-26 14:38:41 -07:00
yohann
4246de4508 mtl/ofi: Treat error correctly. 2015-10-26 14:38:33 -07:00
George Bosilca
2622b9d3a1 Fix minor issues in the treematch topo
based on a patch provided by Guillaume.
2015-10-25 21:38:59 -04:00
Gilles Gouaillardet
1105634ca1 mpi_f08: fix MPI_WIN_{ATTACH,DETACH} bindings
fixes INTENT from open-mpi/ompi@9600e2bc63

Thanks Jeff for pointing this out!
2015-10-26 10:02:45 +09:00
George Bosilca
4ac247b1da Minor update to the validity checks for the alltoall collectives. 2015-10-24 15:25:28 -04:00
Jeff Squyres
140cf90e3e osc_rdma: minor compiler warning stomp 2015-10-23 06:21:56 -07:00
Ralph Castain
4c12022a50 Silence a couple of warnings from valgrind and compilers. Since some pmix components may return success with a NULL value from a "get", check for that situation before attempting to unload the data. Preset the hostname before calling modex_recv to get it so unload properly checks for NULL. Cast a returned value to the correct ompi_proc_t pointer 2015-10-22 20:56:02 -07:00
Nathan Hjelm
9dad35b467 Merge pull request #1061 from hjelmn/osc_fixes
one-sided fixes
2015-10-22 18:23:19 -06:00
Nathan Hjelm
63e744ffc6 osc/rdma: use only a single btl registration for local state
This commit fixes a bug that can occur on Cray Gemini networks. If
multiple registrations are used for the local state then we lose the
atomicity guarantees. To avoid issues like this use only a single
registration handle for all local state on a node.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-10-22 15:51:19 -06:00
Nathan Hjelm
f690fc8fd5 osc/pt2pt: fix warnings
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-10-22 15:50:40 -06:00
Nathan Hjelm
e716866e0c Merge pull request #1057 from hjelmn/binding_fix
win: fix erroneous argument check
2015-10-22 15:15:43 -06:00
Jeff Squyres
86270e7613 MPI_File_open: add note about allowable chars in filenames
Thanks to @nasailja for the original text suggestion.
2015-10-22 11:56:53 -07:00
Nathan Hjelm
e4219aa692 Merge pull request #1059 from hjelmn/osc_fixes
osc/rdma: bug fixes
2015-10-22 11:25:51 -06:00
Nathan Hjelm
97c9732bad osc/rdma: bug fixes
This commit fixes the following:

 - CIDs 1328491, 1328492: Dead code caused by typos in a prior
   commit.

 - Fix the calculation of dynamic memory regions. This was causing
   incorrect RMA range errors when accessing the last partial page of
   an attachment.

 - Fix a SEGV when using dynamic memory windows with local state (all
   processes on the same node).

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-10-22 09:49:38 -06:00
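The commit message above does not show the corrected region calculation. As a minimal sketch of the general idea, assuming a power-of-two page size, the bounds of an attached region can be rounded outward to page boundaries so that the last partial page stays inside the valid range. The function name `region_page_bounds` and its parameters are hypothetical and are not taken from the osc/rdma source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the osc/rdma implementation.
 * Rounds the region [base, base + len) outward to page boundaries.
 * Assumes page_size is a power of two. */
static void region_page_bounds(uintptr_t base, size_t len, uintptr_t page_size,
                               uintptr_t *start, uintptr_t *end)
{
    uintptr_t mask = page_size - 1;

    /* Round the start down to the beginning of its page. */
    *start = base & ~mask;

    /* Round the end up, so a trailing partial page is fully covered. */
    *end = (base + len + mask) & ~mask;
}
```

Rounding the end down instead of up would exclude the trailing partial page, which is one way a valid access could be reported as out of range.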
yohann
889c76634e mtl/ofi: Increase priority. 2015-10-22 08:39:36 -07:00
Nathan Hjelm
6ae57647ab win: fix erroneous argument check
When using dynamic memory windows the displacement becomes a
pointer. Since the high bit may be set on valid pointers on some
platforms the check for disp > 0 is invalid. This commit adds the
window flavor to ompi_win_t and disables the displacement check when
operating on dynamic memory windows.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-10-22 09:33:26 -06:00
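To illustrate the reasoning in the commit message above, here is a minimal sketch of a displacement check that depends on the window flavor. The enum, the function name `disp_is_valid`, and the use of `intptr_t` as a stand-in for the displacement type are assumptions made for illustration; the actual change adds the flavor to `ompi_win_t`, whose layout is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical window flavors; the names are illustrative only. */
typedef enum { WIN_FLAVOR_CREATE, WIN_FLAVOR_DYNAMIC } win_flavor_t;

/* Returns true when the displacement is acceptable for the given flavor.
 * On a dynamic window the displacement is an address, and a valid address
 * may have its high bit set, which makes it negative as a signed integer.
 * The sign check is therefore skipped for dynamic windows. */
static bool disp_is_valid(win_flavor_t flavor, intptr_t disp)
{
    if (flavor == WIN_FLAVOR_DYNAMIC) {
        return true;
    }
    return disp >= 0;
}
```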
Howard Pritchard
ce8e241922 Merge pull request #1055 from nrgraham23/java_warnings_fix
Fix Java related warnings
2015-10-22 08:17:45 -06:00
Nathan Hjelm
b2fa2a9bef Merge pull request #1056 from hjelmn/osc_fixes
osc/pt2pt: reset all_sync sync object before sending complete messages
2015-10-21 19:40:28 -06:00
Nathan Hjelm
864f88a2a3 osc/pt2pt: reset all_sync sync object before sending complete messages
This commit fixes a bug that occurs when a post message comes in when
sending complete messages or while waiting for all outgoing messages
to flush. In that case the post message might get incorrectly
associated with the ending sync object.

References open-mpi/ompi#1012

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-10-21 18:30:08 -06:00
Nathaniel Graham
c4d70ab425 Fix Java related warnings
This commit fixes java related warnings.

Fixes #881

Signed-off-by: Nathaniel Graham <ngraham@lanl.gov>
2015-10-21 17:14:25 -07:00
Nathan Hjelm
08e267b811 add_procs: add threading protection for dynamic add_procs
This commit adds protection to the group, ob1, and bml endpoint lookup
code. For ob1 and the bml a lock has been added. For performance
reasons the lock is only held if a bml or ob1 endpoint does not
exist. ompi_group_dense_lookup now uses opal_atomic_cmpset to ensure
the proc is only retained by the thread that actually updates the
group.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-10-21 16:13:41 -06:00
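The compare-and-set pattern described in the commit message above can be sketched with C11 atomics. This is not the Open MPI code: `proc_t`, `proc_retain`, and `slot_lookup` are hypothetical names, C11 `atomic_compare_exchange_strong` stands in for `opal_atomic_cmpset`, and the sketch assumes the resolved object is kept alive by another owner while the slot is being filled.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical reference-counted object standing in for a proc. */
typedef struct {
    atomic_int refcount;
} proc_t;

static void proc_retain(proc_t *p)
{
    atomic_fetch_add(&p->refcount, 1);
}

/* Fill an empty slot with `resolved` exactly once.
 * Only the thread whose compare-and-set succeeds retains the object;
 * a thread that loses the race returns the pointer already stored. */
static proc_t *slot_lookup(_Atomic(proc_t *) *slot, proc_t *resolved)
{
    proc_t *current = atomic_load(slot);
    if (current != NULL) {
        return current;              /* fast path: already resolved */
    }

    proc_t *expected = NULL;
    if (atomic_compare_exchange_strong(slot, &expected, resolved)) {
        proc_retain(resolved);       /* this thread updated the slot */
        return resolved;
    }
    return expected;                 /* another thread updated it first */
}
```

Because the retain happens only on the winning path, repeated or concurrent lookups of the same slot leave the reference count incremented once.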
Nathan Hjelm
386991d590 Merge pull request #1052 from hjelmn/osc_rdma_fixes
osc/rdma: use standard verbosity levels
2015-10-21 13:21:50 -06:00
Nathan Hjelm
9476c7bbca osc/rdma: use standard verbosity levels
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-10-21 12:31:41 -06:00
yohann
abe5002ee9 mtl/ofi: remove threading and progress hints. 2015-10-21 10:25:08 -07:00
yosefe
cc76db8d39 ucx: reduce components priority to 5. 2015-10-21 17:38:25 +03:00
Gilles Gouaillardet
a0782e1c7e mpi: MPI_Neighbor_all* and MPI_Ineighbor_all* do not work with
inter communicators (fail with MPI_ERR_COMM) or communicators without a
process topology (fail with MPI_ERR_TOPOLOGY)
2015-10-21 16:21:19 +09:00
Mike Dubman
4ea13f10f6 Merge pull request #1008 from alex-mikheev/topic/ucx_support
UCX support for ompi and oshmem
2015-10-21 09:33:33 +03:00
Gilles Gouaillardet
3b0b929883 ompi: MPI_IN_PLACE is not a valid argument of MPI_Neighbor_all* and MPI_Ineighbor_all* 2015-10-21 14:46:35 +09:00
Gilles Gouaillardet
256976a108 mpi: MPI_IN_PLACE is not a valid argument of MPI_All* and MPI_Iall* with an inter communicator 2015-10-21 14:46:28 +09:00
Gilles Gouaillardet
9b31530d5c man: misc fixes to the revamp of MPI_File_* and MPI_Register_datarep man pages
that fix commit open-mpi/ompi@b17c89c1e6:
- remove unnecessary empty lines
- add USE mpi in MPI_Register_datarep

Thanks Jeff for noticing this
2015-10-21 09:41:01 +09:00
Nathan Hjelm
763744a32c Merge pull request #1046 from hjelmn/osc_rdma_fixes
osc/rdma: bug fixes
2015-10-20 16:44:38 -06:00
Nathan Hjelm
b8ee05d352 osc/rdma: bug fixes
This commit fixes several bugs in the osc/rdma component:

 - Complete aggregated requests immediately. Completion of RMA
   requests indicates local completion anyway. This fixes a hang in
   the c_reqops test.

 - Correctly mark Rget_accumulate requests.

 - Set the local base flag correctly on the local peer.

 - Clear or set the no locks flag on the window if the value is
   changed by MPI_Win_set_info.

 - Actually update the target when using MPI_OP_REPLACE.

Fixes open-mpi/ompi#1010

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-10-20 15:27:15 -06:00
Ryan Grant
f60c506c68 Merge pull request #999 from tkordenbrock/topic/add.triggered.gather
coll-portals4: add gather and igather implementations that use Portals4 triggered operations
2015-10-20 14:59:09 -06:00
yosefe
a313588337 ompi: Add UCX PML. 2015-10-20 19:46:06 +03:00
yosefe
502dc8aaa4 add pml-specific field in OMPI datatype.
PML UCX will use it to cache a handle for UCX datatype.
2015-10-20 19:46:06 +03:00
Jeff Squyres
630d6bf800 Merge pull request #1038 from kawashima-fj/pr/man-correction
man: Various manpage corrections
2015-10-20 06:40:05 -04:00