Edgar Gabriel
9e5ade4e8b
argh, a debugging sleep statement got into the source code.
2015-11-16 13:26:57 -06:00
Edgar Gabriel
dbfbcdecd5
make adjustments for the default settings of grouping parameters and the default contiguous group size option.
...
minor bug fix in the simple grouping strategy.
2015-11-16 08:17:27 -06:00
Edgar Gabriel
27628774c7
add a new option for a simple aggregator selection which has zero communication costs.
2015-11-16 08:17:26 -06:00
Edgar Gabriel
66c1ea5fcb
change the default value of the grouping option. Also add new grouping option which skips the refinement step in the aggregator selection.
2015-11-16 08:17:23 -06:00
Edgar Gabriel
e8e117503d
reduce the communication volume during MPI_File_set_view
2015-11-16 08:17:22 -06:00
yohann
005400a937
mtl/ofi: Make sure the resources are managed by the provider.
2015-11-13 16:16:58 -08:00
Nathan Hjelm
9ef0821856
osc/rdma: fix some threading bugs
...
There were two bugs in osc/rdma when using threads:
- Deadlock in ompi_osc_rdma_start_atomic. This occurs because
ompi_osc_rdma_frag_alloc is called with the module lock. To fix the
issue the module lock is now recursive. In the future I will add a
new lock to protect just the current rdma fragment.
- Do not drop the lock in ompi_osc_rdma_frag_alloc when calling
ompi_osc_rdma_frag_complete. Not only is it not needed but dropping
the lock at this point can cause a competing thread to mess up the
state.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-11-12 20:25:57 -07:00
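The first fix in the osc/rdma commit above (making the module lock recursive) can be illustrated with a minimal sketch using plain POSIX threads. Everything named `demo_*` is hypothetical and exists only for this illustration; it is not the osc/rdma code, and it assumes the allocation path takes the same lock that its caller already holds.

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>

/* Hypothetical stand-in for a module that owns one lock (not the real osc/rdma type). */
typedef struct {
    pthread_mutex_t lock;
    int frags_allocated;
} demo_module_t;

/* Initialize the lock as recursive so the owning thread may re-acquire it. */
int demo_module_init(demo_module_t *module)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (0 != rc) {
        return rc;
    }
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (0 == rc) {
        rc = pthread_mutex_init(&module->lock, &attr);
    }
    pthread_mutexattr_destroy(&attr);
    module->frags_allocated = 0;
    return rc;
}

/* Assumed allocation path: takes the module lock on its own. */
int demo_frag_alloc(demo_module_t *module)
{
    int rc = pthread_mutex_lock(&module->lock);
    if (0 != rc) {
        return rc;
    }
    module->frags_allocated++;
    return pthread_mutex_unlock(&module->lock);
}

/* Caller that already holds the lock when it allocates. With a
 * non-recursive default mutex this nested lock would self-deadlock. */
int demo_start_atomic(demo_module_t *module)
{
    int rc = pthread_mutex_lock(&module->lock);
    if (0 != rc) {
        return rc;
    }
    rc = demo_frag_alloc(module);
    pthread_mutex_unlock(&module->lock);
    return rc;
}
```

Calling `demo_module_init` followed by `demo_start_atomic` completes without blocking because the nested acquisition is made by the thread that already owns the lock. A recursive lock is the simpler of the two options the commit message mentions; a separate lock guarding only the current fragment would avoid the nesting altogether.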
Yossi
b750b72a81
Merge pull request #1127 from yosefe/topic/pml-ucx-implement-cancel
...
pml_ucx: implement cancel, and add small optimizations.
2015-11-12 10:50:48 +02:00
yosefe
7becc54d67
pml_ucx: fix typo.
2015-11-12 09:57:41 +02:00
Todd Kordenbrock
b9630f802b
Merge pull request #1120 from francois-wellenreiter/mtl_min_mdbind
...
mtl-portals4 : remove useless PtlMDBind PtlMDRelease calls for rendez-vous messages
2015-11-10 14:34:19 -06:00
yosefe
d66b01d380
pml_ucx: implement cancel, and add small optimizations.
2015-11-10 17:40:06 +02:00
Gilles Gouaillardet
d6ff25b9a2
pml/monitoring: initialize common symbols
2015-11-10 13:58:54 +09:00
Jeff Squyres
e89ecac83c
bml r2: fix exclusivity comparison
...
Fixes open-mpi/ompi#1106
2015-11-06 13:26:32 -08:00
Francois WELLENREITER
b301b49a40
MTL portals4 : remove useless PtlMDBind PtlMDRelease calls for rendez-vous messages
2015-11-06 15:55:44 +01:00
Ralph Castain
bfdf08ae86
Fix intercomm_create by ensuring that both sides know how to translate jobid to/from nspace
...
Return something just to ensure that pack is happy
2015-11-06 02:19:45 -08:00
Nathan Hjelm
fda5daf453
Merge pull request #1096 from kawashima-fj/pr/fortran-var-type-fix
...
Fix Fortran variable types
2015-11-05 14:27:40 -07:00
Nathan Hjelm
acf3cb9b9b
Merge pull request #1095 from kawashima-fj/pr/trivial-fixes
...
Some trivial fixes
2015-11-04 09:45:59 -07:00
yosefe
45c3d04857
pml_ucx: fix request construct/destruct.
...
We should invoke OBJ_CONSTRUCT/OBJ_DESTRUCT only on regular requests
(which are embedded inside UCX requests) and for the completed request.
Persistent requests are already constructed/destructed by the free list.
This fixes an assertion in ompi_request_destruct.
2015-11-04 11:03:46 +02:00
KAWASHIMA Takahiro
c09f9f05d3
mpi/tool: Fix an incorrect type cast.
...
This bug caused an invalid result value on `MPI_T_cvar_read`
on big-endian machines or for large (>=2Gi) cvar values.
2015-11-04 11:28:43 +09:00
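The class of bug described in the mpi/tool commit above can be sketched as follows. This is an illustration under stated assumptions, not the actual `MPI_T_cvar_read` code: the log does not show the offending cast, so the `demo_*` functions are hypothetical and simply model a 64-bit value being written into a caller's buffer as if it were 32 bits wide.

```c
#include <stdint.h>
#include <string.h>

/* Buggy variant (hypothetical): treats the caller's 64-bit buffer as 32-bit. */
void demo_read_buggy(uint64_t stored, void *buf)
{
    uint32_t narrow = (uint32_t) stored;   /* drops the upper 32 bits */
    memcpy(buf, &narrow, sizeof(narrow));  /* fills only the first 4 bytes of buf */
}

/* Fixed variant: copies the value at its full width. */
void demo_read_fixed(uint64_t stored, void *buf)
{
    memcpy(buf, &stored, sizeof(stored));
}
```

In the buggy variant a large value loses its upper bits on any machine. On a big-endian machine even a small value is wrong, because the four bytes that get written are the most significant bytes of the caller's 64-bit variable.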
KAWASHIMA Takahiro
384f4b51d1
fortran: Fix: missing `dimension(*)` in `(I)NEIGHBOR_ALLTOALLW`.
2015-11-04 10:38:25 +09:00
KAWASHIMA Takahiro
1092eabfab
fortran: Update comment.
...
The structure was changed in commit 9c77c6b.
2015-11-04 10:38:25 +09:00
KAWASHIMA Takahiro
107c0073dd
fortran: Fix: `MPI_UNWEIGHTED` and `MPI_WEIGHTS_EMPTY` should be arrays.
...
Without this modification, gfortran throws the following error
if these variables are used for `MPI_DIST_GRAPH_CREATE_ADJACENT`.
Error: There is no specific subroutine for the generic
'mpi_dist_graph_create_adjacent' at (1)
2015-11-04 10:38:25 +09:00
KAWASHIMA Takahiro
d5e1f40a1e
fortran: Fix: `info` should be an integer parameter.
2015-11-04 10:38:24 +09:00
KAWASHIMA Takahiro
9bf93810d7
fortran: Fix: array dimension of `MPI_ARGVS_NULL`.
...
`MPI_ARGVS_NULL` should be a two-dimensional array.
Without this modification, gfortran throws the following error
if `MPI_ARGVS_NULL` is used for `MPI_COMM_SPAWN_MULTIPLE`.
Error: There is no specific subroutine for the generic
'mpi_comm_spawn_multiple' at (1)
2015-11-04 10:38:24 +09:00
George Bosilca
b14212f142
Fix Coverity issue 1338059.
2015-11-02 22:51:52 -05:00
George Bosilca
5c60e76669
Fix Coverity CIDs 1338021, 1338020, 1338019, 1338018.
2015-11-02 17:38:51 -05:00
bosilca
f1a5362f94
Merge pull request #1072 from bosilca/topic/resized
...
Fix for the subarray and darray type creation issue.
2015-11-01 21:17:03 -05:00
George Bosilca
b77c203068
Add more comments and restore the progress, flags, max tag, and max
...
context_id from the original PML.
2015-10-31 17:13:35 -04:00
George Bosilca
3efd494972
Make sure the monitoring infrastructure works well with the
...
new dynamic add_procs.
2015-10-31 17:13:35 -04:00
Guillaume Papauré
86714ad91e
change pml_monitoring_messages_count and pml_monitoring_messages_size pvars to use the start/stop features
2015-10-31 17:13:35 -04:00
George Bosilca
a43c2ce529
Fully integrate the monitoring with the MPI_T PVAR.
...
Writing to the pml_monitoring_flush variable will set the filename of
the output file.
Stopping a session for the pml_monitoring_flush will force the
generation of the monitoring output file (as long as the filename
is not NULL).
To reset the monitoring, one has to bind the pml_monitoring_flush to a
session.
2015-10-31 17:13:35 -04:00
George Bosilca
646a662721
Use the new group interface and add const to the PML send functions.
2015-10-31 17:13:35 -04:00
George Bosilca
5224a7ce4d
Allow the pvar to be written by invoking the associated callback.
...
Use a PVAR to generate the monitoring dump of the information into a
file.
Use the PVAR to instruct the PML monitoring when to do the dump.
2015-10-31 17:13:35 -04:00
George Bosilca
df167f4177
Rewrite the close logic to be cleaner.
2015-10-31 17:13:35 -04:00
George Bosilca
c801ffde86
Use MPI_T variables to handle the flush in a more MPI-blessed way.
...
Code cleanup.
Update the monitoring test to use MPI_T variables.
2015-10-31 17:13:35 -04:00
George Bosilca
4f88c82500
Fix a conversion problem and add a comment about the lack of component
...
retain in the new component infrastructure.
Clean Makefile.am to fix "make distcheck".
Update the gitignore rules.
2015-10-31 17:13:35 -04:00
George Bosilca
80343a0d39
add ability to query pml monitoring results with MPI Tools interface
...
using performance variables "pml_monitoring_messages_count" and
"pml_monitoring_messages_size"
Per Brice's suggestion, make all data counts and message lengths
uint64_t.
2015-10-31 17:13:35 -04:00
George Bosilca
a47d69202f
Add a monitoring PML. This PML tracks all data exchanges by the processes
...
counting or not the collective traffic as a separate entity. The need
for such a PML is simply because the PMPI interface doesn't allow us to
identify the collective generated traffic.
2015-10-31 17:13:35 -04:00
Rolf vandeVaart
578385ca78
Merge pull request #1079 from rolfv/pr/cuda-require-41
...
Make CUDA 4.1 a requirement for CUDA-aware support
2015-10-29 12:56:22 -04:00
Nathan Hjelm
b1e3936261
Merge pull request #1078 from rolfv/pr/disable-osc-rdma-for-cuda
...
Disable the use of osc rdma when we detect a GPU buffer
2015-10-29 10:03:28 -06:00
Rolf vandeVaart
f2ff6e03ab
Make CUDA 4.1 a requirement for CUDA-aware support.
...
Remove all related preprocessor conditionals.
2015-10-29 11:24:02 -04:00
Matias Cabral
8ebcac1b2c
Merge pull request #1075 from matcabral/psm2_symbol_rename
...
Updated psm2 mtl with new externally exposed symbols of psm2.so
Fixes open-mpi/ompi#1018
Fixes open-mpi/ompi#1021
2015-10-28 13:55:45 -07:00
Rolf vandeVaart
87a4cc6118
Disable the use of osc rdma when we detect a GPU buffer as it is not supported in that component.
...
This forces a failover to the osc pt2pt component. Fixes #1012
2015-10-28 14:47:45 -04:00
yosefe
ae738d0434
pml_ucx: add pmi fence in del_procs
2015-10-28 18:34:36 +02:00
Matias A Cabral
ed16d8e1cc
Updated psm2 mtl with new externally exposed symbols of psm2.so
...
Fixes open-mpi/ompi#1018
Fixes open-mpi/ompi#1021
2015-10-28 09:12:33 -07:00
yosefe
41b6230be3
pml_ucx: fix debug macros, and initialize mpi request properly.
2015-10-28 10:59:25 +02:00
Ralph Castain
267ca8fcd3
Cleanup the PMIx direct modex support. Add an MCA parameter pmix_base_async_modex that will cause the async modex to be used when set to 1. Default it to 0 for now
...
to continue current default behavior.
Also add an MCA param pmix_base_collect_data to direct that the blocking fence shall return all data to each process. Obviously, this param has no effect if async_modex is used.
2015-10-27 17:31:56 -07:00
yohann
8bf1c95cdc
mtl/ofi: Remove unused help messages.
2015-10-27 09:38:04 -07:00
Nathan Hjelm
69d403d42b
Merge pull request #1054 from hjelmn/add_procs_threading
...
add_procs: add threading protection for dynamic add_procs
2015-10-27 09:27:13 -06:00
George Bosilca
679dc9b437
Fix the subarray and darray type creation. Include a
...
small patch provided by Gilles.
2015-10-26 23:44:26 -04:00