Mike Dubman
c544620a7c
Merge pull request #1138 from igor-ivanov/pr/yalla-valgrind
...
yalla: fix valgrind error due to uninitialized status field.
2015-11-20 07:19:11 -05:00
Gilles Gouaillardet
002c7b8b3a
fcoll/two_phase: use PMPI_* instead of MPI_*
2015-11-20 13:46:19 +09:00
Gilles Gouaillardet
561e7f6647
vprotocol/pessimist: use internal ompi_* instead of MPI_*
2015-11-20 13:46:19 +09:00
Gilles Gouaillardet
025fd8a9fc
osc: use PMPI_* instead of MPI_*
2015-11-20 13:46:19 +09:00
Gilles Gouaillardet
d816d1c194
coll/libnbc: use PMPI_* and internal ompi_* instead of MPI_*
2015-11-20 13:46:19 +09:00
yosefe
3bb1270715
yalla: fix valgrind error due to uninitialized status field.
2015-11-19 10:59:31 +02:00
Francois WELLENREITER
9126ea5e82
MTL portals4 : improve the rendez-vous protocol using PtlTriggeredGet operation
2015-11-19 09:52:53 +01:00
Edgar Gabriel
9e5ade4e8b
argh, a debugging sleep statement got into the source code.
2015-11-16 13:26:57 -06:00
Edgar Gabriel
dbfbcdecd5
make adjustments for the default settings of grouping parameters and the default contiguous group size option.
...
minor bug fix in the simple grouping strategy.
2015-11-16 08:17:27 -06:00
Edgar Gabriel
27628774c7
add a new option for a simple aggregator selection which has zero communication costs.
2015-11-16 08:17:26 -06:00
Edgar Gabriel
66c1ea5fcb
change the default value of the grouping option. Also add new grouping option which skips the refinement step in the aggregator selection.
2015-11-16 08:17:23 -06:00
Edgar Gabriel
e8e117503d
reduce the communication volume during MPI_File_set_view
2015-11-16 08:17:22 -06:00
yohann
005400a937
mtl/ofi: Make sure the resources are managed by the provider.
2015-11-13 16:16:58 -08:00
Nathan Hjelm
9ef0821856
osc/rdma: fix some threading bugs
...
There were two bugs in osc/rdma when using threads:
- Deadlock in ompi_osc_rdma_start_atomic. This occurs because
ompi_osc_rdma_frag_alloc is called with the module lock. To fix the
issue the module lock is now recursive. In the future I will add a
new lock to protect just the current rdma fragment.
- Do not drop the lock in ompi_osc_rdma_frag_alloc when calling
ompi_osc_rdma_frag_complete. Not only is it not needed but dropping
the lock at this point can cause a competing thread to mess up the
state.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-11-12 20:25:57 -07:00
Yossi
b750b72a81
Merge pull request #1127 from yosefe/topic/pml-ucx-implement-cancel
...
pml_ucx: implement cancel, and add small optimizations.
2015-11-12 10:50:48 +02:00
yosefe
7becc54d67
pml_ucx: fix typo.
2015-11-12 09:57:41 +02:00
Todd Kordenbrock
b9630f802b
Merge pull request #1120 from francois-wellenreiter/mtl_min_mdbind
...
mtl-portals4 : remove useless PtlMDBind PtlMDRelease calls for rendez-vous messages
2015-11-10 14:34:19 -06:00
yosefe
d66b01d380
pml_ucx: implement cancel, and add small optimizations.
2015-11-10 17:40:06 +02:00
Gilles Gouaillardet
d6ff25b9a2
pml/monitoring: initialize common symbols
2015-11-10 13:58:54 +09:00
Jeff Squyres
e89ecac83c
bml r2: fix exclusivity comparison
...
Fixes open-mpi/ompi#1106
2015-11-06 13:26:32 -08:00
Francois WELLENREITER
b301b49a40
MTL portals4 : remove useless PtlMDBind PtlMDRelease calls for rendez-vous messages
2015-11-06 15:55:44 +01:00
yosefe
45c3d04857
pml_ucx: fix request construct/destruct.
...
We should invoke OBJ_CONSTRUCT/OBJ_DESTRUCT only on regular requests
(which are embedded inside UCX requests) and for the completed request.
Persistent requests are already constructed/destructed by the free list.
This fixes an assertion in ompi_request_destruct.
2015-11-04 11:03:46 +02:00
Todd Kordenbrock
cefe50cf54
mtl-portals4: test for valid handle before releasing resources
...
During component finalize, mtl-portals4 would blindly release
resources without testing if the handle was valid. This was OK,
but resource allocation is now delayed until add_procs(). If
mtl-portals4 is deselected, it will be finalized without
add_procs() ever being called. This commit ensures that invalid
handles are not released.
2015-11-02 21:01:14 -06:00
George Bosilca
5c60e76669
Fix Coverity CIDs 1338021, 1338020, 1338019, 1338018.
2015-11-02 17:38:51 -05:00
George Bosilca
b77c203068
Add more comments and restore the progress, flags, max tag, and max
...
context_id from the original PML.
2015-10-31 17:13:35 -04:00
George Bosilca
3efd494972
Make sure the monitoring infrastructure works well with the
...
new dynamic add_procs.
2015-10-31 17:13:35 -04:00
Guillaume Papauré
86714ad91e
change pml_monitoring_messages_count and pml_monitoring_messages_size pvars to use the start/stop features
2015-10-31 17:13:35 -04:00
George Bosilca
a43c2ce529
Fully integrate the monitoring with the MPI_T PVAR.
...
Writing to the pml_monitoring_flush variable will set the filename of
the output file.
Stopping a session for the pml_monitoring_flush will force the
generation of the monitoring output file (as long as the filename
is not NULL).
To reset the monitoring, one has to bind the pml_monitoring_flush to a
session.
2015-10-31 17:13:35 -04:00
George Bosilca
646a662721
Use the new group interface and add const to the PML send functions.
2015-10-31 17:13:35 -04:00
George Bosilca
5224a7ce4d
Allow the pvar to be written by invoking the associated callback.
...
Use a PVAR to generate the monitoring dump of the information into a
file.
Use the PVAR to instruct the PML monitoring when to do the dump.
2015-10-31 17:13:35 -04:00
George Bosilca
df167f4177
Rewrite the close logic to be cleaner.
2015-10-31 17:13:35 -04:00
George Bosilca
c801ffde86
Use MPI_T variables to handle the flush in a more MPI-blessed way.
...
Code cleanup.
Update the monitoring test to use MPI_T variables.
2015-10-31 17:13:35 -04:00
George Bosilca
4f88c82500
Fix a conversion problem and add a comment about the lack of component
...
retain in the new component infrastructure.
Clean Makefile.am to fix "make distcheck".
Update the gitignore rules.
2015-10-31 17:13:35 -04:00
George Bosilca
80343a0d39
add ability to query pml monitoring results with MPI Tools interface
...
using performance variables "pml_monitoring_messages_count" and
"pml_monitoring_messages_size"
Per Brice's suggestion, make all data counts and message lengths
uint64_t.
2015-10-31 17:13:35 -04:00
George Bosilca
a47d69202f
Add a monitoring PML. This PML tracks all data exchanges by the processes
...
counting or not the collective traffic as a separate entity. The need
for such a PML is simply because the PMPI interface doesn't allow us to
identify the collective generated traffic.
2015-10-31 17:13:35 -04:00
Rolf vandeVaart
578385ca78
Merge pull request #1079 from rolfv/pr/cuda-require-41
...
Make CUDA 4.1 a requirement for CUDA-aware support
2015-10-29 12:56:22 -04:00
Nathan Hjelm
b1e3936261
Merge pull request #1078 from rolfv/pr/disable-osc-rdma-for-cuda
...
Disable the use of osc rdma when we detect a GPU buffer
2015-10-29 10:03:28 -06:00
Rolf vandeVaart
f2ff6e03ab
Make CUDA 4.1 a requirement for CUDA-aware support.
...
Remove all related preprocessor conditionals.
2015-10-29 11:24:02 -04:00
Matias Cabral
8ebcac1b2c
Merge pull request #1075 from matcabral/psm2_symbol_rename
...
Updated psm2 mtl with new externally exposed symbols of psm2.so
Fixes open-mpi/ompi#1018
Fixes open-mpi/ompi#1021
2015-10-28 13:55:45 -07:00
Rolf vandeVaart
87a4cc6118
Disable the use of osc rdma when we detect a GPU buffer as it is not supported in that component.
...
This forces a failover to the osc pt2pt component. Fixes #1012
2015-10-28 14:47:45 -04:00
yosefe
ae738d0434
pml_ucx: add pmi fence in del_procs
2015-10-28 18:34:36 +02:00
Matias A Cabral
ed16d8e1cc
Updated psm2 mtl with new externally exposed symbols of psm2.so
...
Fixes open-mpi/ompi#1018
Fixes open-mpi/ompi#1021
2015-10-28 09:12:33 -07:00
yosefe
41b6230be3
pml_ucx: fix debug macros, and initialize mpi request properly.
2015-10-28 10:59:25 +02:00
yohann
8bf1c95cdc
mtl/ofi: Remove unused help messages.
2015-10-27 09:38:04 -07:00
Nathan Hjelm
69d403d42b
Merge pull request #1054 from hjelmn/add_procs_threading
...
add_procs: add threading protection for dynamic add_procs
2015-10-27 09:27:13 -06:00
yohann
a111d66f0f
mtl/ofi: Change hints to FI_PROGRESS_MANUAL.
2015-10-26 15:32:30 -07:00
yohann
fde8b89ceb
mtl/ofi: Use OFI's representation of ANY_SRC instead of NULL.
2015-10-26 14:38:41 -07:00
yohann
4246de4508
mtl/ofi: Treat error correctly.
2015-10-26 14:38:33 -07:00
George Bosilca
2622b9d3a1
Fix minor issues in the treematch topo
...
based on a patch provided by Guillaume.
2015-10-25 21:38:59 -04:00
Jeff Squyres
140cf90e3e
osc_rdma: minor compiler warning stomp
2015-10-23 06:21:56 -07:00