1
1

8660 Коммитов

Автор SHA1 Сообщение Дата
igor.ivanov@itseez.com
c15bf147bf opal: Add opal_abort_print_stack mca variable with aliases for ompi/oshmem
This commit allows to control output during abnormal oshmem/ompi application
termination.
Fixed issue in backtrace output. HAVE_BACKTRACE was never set so user was limited
in control of this variable.
Two related mca variables are moved to opal layer. Corresponding aliases are
added for ompi and oshmem.
2015-11-25 18:18:33 +02:00
Ryan Grant
81d482dca6 Merge pull request #1137 from francois-wellenreiter/trig_mtl_rdv
MTL portals4 : improve the rendez-vous protocol using PtlTriggeredGet…
2015-11-24 17:31:31 -07:00
Ryan Grant
219581e87e Merge pull request #1090 from tkordenbrock/topic/check.for.invalid.handles.in.finalize
mtl-portals4: test for valid handle before releasing resources
2015-11-20 07:54:44 -06:00
Mike Dubman
c544620a7c Merge pull request #1138 from igor-ivanov/pr/yalla-valgrind
yalla: fix valgrind error due to uninitialized status field.
2015-11-20 07:19:11 -05:00
Gilles Gouaillardet
002c7b8b3a fcoll/two_phase: use PMPI_* insted of MPI_* 2015-11-20 13:46:19 +09:00
Gilles Gouaillardet
561e7f6647 vprotocol/pessimist: use internal ompi_* insted of MPI_* 2015-11-20 13:46:19 +09:00
Gilles Gouaillardet
025fd8a9fc osc: use PMPI_* insted of MPI_* 2015-11-20 13:46:19 +09:00
Gilles Gouaillardet
d816d1c194 coll/libnbc: use PMPI_* and internal ompi_* insted of MPI_* 2015-11-20 13:46:19 +09:00
yosefe
3bb1270715 yalla: fix valgrind error due to uninitialized status field. 2015-11-19 10:59:31 +02:00
Francois WELLENREITER
9126ea5e82 MTL portals4 : improve the rendez-vous protocol using PtlTriggeredGet operation 2015-11-19 09:52:53 +01:00
Edgar Gabriel
9e5ade4e8b argh, a debugging sleep statement got into the source code. 2015-11-16 13:26:57 -06:00
Edgar Gabriel
dbfbcdecd5 make adjustments for the default settings of grouping parameters and the default contiguous group size option.
minor bug fix in the simple grouping strategy.
2015-11-16 08:17:27 -06:00
Edgar Gabriel
27628774c7 add a new option for a simple aggregator selection which has zero communication costs. 2015-11-16 08:17:26 -06:00
Edgar Gabriel
66c1ea5fcb change the default value of the grouping option. Also add new grouping option which skips the refinement step in the aggregator selection. 2015-11-16 08:17:23 -06:00
Edgar Gabriel
e8e117503d reduce the communication volume during MPI_File_set_view 2015-11-16 08:17:22 -06:00
yohann
005400a937 mtl/ofi: Make sure the resources are managed by the provider. 2015-11-13 16:16:58 -08:00
Nathan Hjelm
9ef0821856 osc/rdma: fix some threading bugs
There were two bugs in osc/rdma when using threads:

 - Deadlock is ompi_osc_rdma_start_atomic. This occurs because
   ompi_osc_rdma_frag_alloc is called with the module lock. To fix the
   issue the module lock is now recursive. In the future I will add a
   new lock to protect just the current rdma fragment.

 - Do not drop the lock in ompi_osc_rdma_frag_alloc when calling
   ompi_osc_rdma_frag_complete. Not only is it not needed but dropping
   the lock at this point can cause a competing thread to mess up the
   state.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-11-12 20:25:57 -07:00
Yossi
b750b72a81 Merge pull request #1127 from yosefe/topic/pml-ucx-implement-cancel
pml_ucx: implement cancel, and add small optimizations.
2015-11-12 10:50:48 +02:00
yosefe
7becc54d67 pml_ucx: fix typo. 2015-11-12 09:57:41 +02:00
Todd Kordenbrock
b9630f802b Merge pull request #1120 from francois-wellenreiter/mtl_min_mdbind
mtl-portals4 : remove useless PtlMDBind PtlMDRelease calls for rendez-vous messages
2015-11-10 14:34:19 -06:00
yosefe
d66b01d380 pml_ucx: implement cancel, and add small optimizations. 2015-11-10 17:40:06 +02:00
Gilles Gouaillardet
d6ff25b9a2 pml/monitoring: initialize common symbols 2015-11-10 13:58:54 +09:00
Jeff Squyres
e89ecac83c bml r2: fix exclusivity comparison
Fixes open-mpi/ompi#1106
2015-11-06 13:26:32 -08:00
Francois WELLENREITER
b301b49a40 MTL portals4 : remove useless PtlMDBind PtlMDRelease calls for rendez-vous messages 2015-11-06 15:55:44 +01:00
Ralph Castain
bfdf08ae86 Fix intercomm_create by ensuring that both sides know how to translate jobid to/from nspace
Return something just to ensure that pack is happy
2015-11-06 02:19:45 -08:00
Nathan Hjelm
fda5daf453 Merge pull request #1096 from kawashima-fj/pr/fortran-var-type-fix
Fix Fortran variable types
2015-11-05 14:27:40 -07:00
Nathan Hjelm
acf3cb9b9b Merge pull request #1095 from kawashima-fj/pr/trivial-fixes
Some trivial fixes
2015-11-04 09:45:59 -07:00
yosefe
45c3d04857 pml_ucx: fix request construct/destruct.
We should invoke OBJ_CONTRUCT/OBJ_DESTRUCT only on regular requests
(which are embedded inside UCX requests) and for the completed request.
Persistent requests are already constructed/destructed by the free list.
This fixes an assertion in ompi_request_destruct.
2015-11-04 11:03:46 +02:00
KAWASHIMA Takahiro
c09f9f05d3 mpi/tool: Fix an incorrect type cast.
This bug caused an invalid result value on `MPI_T_cvar_read`
on big-endian machines or for large (>=2Gi) cvar values.
2015-11-04 11:28:43 +09:00
KAWASHIMA Takahiro
384f4b51d1 fortran: Fix: missing dimension(*) in (I)NEIGHBOR_ALLTOALLW. 2015-11-04 10:38:25 +09:00
KAWASHIMA Takahiro
1092eabfab fortran: Update comment.
The structure was changed in commit 9c77c6b.
2015-11-04 10:38:25 +09:00
KAWASHIMA Takahiro
107c0073dd fortran: Fix: MPI_UNWEIGHTED and MPI_WEIGHTS_EMPTY should be arrays.
Without this modification, gfortran throw the following error
if these variables are used for `MPI_DIST_GRAPH_CREATE_ADJACENT` or
`MPI_DIST_GRAPH_CREATE_ADJACENT`.

  Error: There is no specific subroutine for the generic
  'mpi_dist_graph_create_adjacent' at (1)
2015-11-04 10:38:25 +09:00
KAWASHIMA Takahiro
d5e1f40a1e fortran: Fix: info should be an integer parameter. 2015-11-04 10:38:24 +09:00
KAWASHIMA Takahiro
9bf93810d7 fortran: Fix: array dimension of MPI_ARGVS_NULL.
`MPI_ARGVS_NULL` should be a two-dimensional array.
Without this modification, gfortran throw the following error
if `MPI_ARGVS_NULL` is used for `MPI_COMM_SPAWN_MULTIPLE`.

  Error: There is no specific subroutine for the generic
  'mpi_comm_spawn_multiple' at (1)
2015-11-04 10:38:24 +09:00
George Bosilca
b14212f142 Fix Coverity issue 1338059. 2015-11-02 22:51:52 -05:00
Todd Kordenbrock
cefe50cf54 mtl-portals4: test for valid handle before releasing resources
During component finalize, mtl-portals4 would blindly release
resources without testing if the handle was valid.  This was OK,
but resource allocation is now delayed until add_procs().  If
mtl-portals4 is deselected, it will be finalized without
add_procs() ever being called.  This commit ensures that invalid
handles are not released.
2015-11-02 21:01:14 -06:00
George Bosilca
5c60e76669 Fix Coverity CIDs 1338021, 1338020, 1338019, 1338018. 2015-11-02 17:38:51 -05:00
bosilca
f1a5362f94 Merge pull request #1072 from bosilca/topic/resized
Fix for the subarray and darray type creation issue.
2015-11-01 21:17:03 -05:00
George Bosilca
b77c203068 Add more comments and restore the progress, flags, max tag, and max
context_id from the original PML.
2015-10-31 17:13:35 -04:00
George Bosilca
3efd494972 Make sure the monitoring infrastructure works well with the
new dynamic add_procs.
2015-10-31 17:13:35 -04:00
Guillaume Papauré
86714ad91e change pml_monitoring_messages_count and pml_monitoring_messages_size pvars to use the start/stop features 2015-10-31 17:13:35 -04:00
George Bosilca
a43c2ce529 Fully integrate the monitoring with the MPI_T PVAR.
Writing to the pml_monitoring_flush variable will set the filename of
the output file.
Stopping a session for the pml_monitoring_flush will force the
generation of the nobitoring output file (as long as the filename
is not NULL).
To reset the monitoring, une has to bind the pml_monitoring_flush to a
session.
2015-10-31 17:13:35 -04:00
George Bosilca
646a662721 Use the new group interface and add const to the PML send functions. 2015-10-31 17:13:35 -04:00
George Bosilca
5224a7ce4d Allow the pvar to be written by invoking the associated callback.
Use a PVAR to generate the monitoring dump of the information into a
file.

Use the PVAR to instruct the PML monitoring when to do the dump.
2015-10-31 17:13:35 -04:00
George Bosilca
df167f4177 Rewrite the close logic to be more clean and cleaner. 2015-10-31 17:13:35 -04:00
George Bosilca
c801ffde86 Use MPI_T variables to handle the flush in a more MPI-blessed way.
Code cleanup.

Update the monitoring test to use MPI_T variables.
2015-10-31 17:13:35 -04:00
George Bosilca
4f88c82500 Fix a convertion problem and add a comment about the lack of component
retain in the new component infrastructure.

Clean Makefile.am to fix "make distcheck".

Update the gitignore rules.
2015-10-31 17:13:35 -04:00
George Bosilca
80343a0d39 add ability to querry pml monitorinting results with MPI Tools interface
using performance variables "pml_monitoring_messages_count" and
"pml_monitoring_messages_size"

Per Brice suggestion make all data count and message length be
uint64_t.
2015-10-31 17:13:35 -04:00
George Bosilca
a47d69202f Add a monitoring PML. This PML track all data exchanges by the processes
counting or not the collective traffic as a separate entity. The need
for such a PML is simply because the PMPI interface doesn't allow us to
identify the collective generated traffic.
2015-10-31 17:13:35 -04:00
Rolf vandeVaart
578385ca78 Merge pull request #1079 from rolfv/pr/cuda-require-41
Make CUDA 4.1 a requirement for CUDA-aware support
2015-10-29 12:56:22 -04:00