1
1
Граф коммитов

6152 Коммитов

Автор SHA1 Сообщение Дата
Jeff Squyres
1504ffb18d ompi_file_delete: output a better error message
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-12-02 11:08:04 -05:00
Gilles Gouaillardet
fe4c4e95eb coll/libnbc: fix MPI_IN_PLACE handling in i{gather,scatter}[v]
MPI_IN_PLACE is only relevant on the root task, so only test is there

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-12-01 13:59:25 +09:00
Gilles Gouaillardet
1a8a276914 coll/libnbc: use zero-size messages in ibarrier
and silence a valgrind warning

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-12-01 13:59:25 +09:00
Gilles Gouaillardet
2eec6a08b5 coll/base: fix ompi_coll_base_reduce_scatter_intra_nonoverlapping() with MPI_IN_PLACE
invoke underlying scatterv with MPI_IN_PLACE when appropriate

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-12-01 13:59:24 +09:00
Gilles Gouaillardet
8b7999469b coll/base: fix MPI_IN_PLACE in ompi_coll_base_reduce_generic()
avoid copying data to itself when MPI_IN_PLACE is used

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-12-01 13:59:24 +09:00
Gilles Gouaillardet
3f1486a508 pml/ob1: initialize one more field in mca_pml_ob1_recv_request_progress_rget()
always initialize recvreq->req_rdma_offset to zero.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-12-01 13:14:23 +09:00
Gilles Gouaillardet
15098161a3 coll/libnbc: add some comments on how locks are used
no code change

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-11-30 17:29:51 +09:00
Ralph Castain
d5fd635efe Bring forward the debugger-related changes
Refs https://github.com/open-mpi/ompi/pull/2425

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-29 13:15:20 -08:00
Valentin Petrov
4cdb8ecaad coll/hcoll: hcoll_context_free
Adds the new API hcoll_conetxt_free that resolves the issues
    observed with the ctx cache and group_destroy_notify.

Signed-off-by: Valentin Petrov <valentinp@mellanox.com>
2016-11-29 07:33:05 +02:00
Jeff Squyres
34ea3ce25a Merge pull request #1946 from thananon/romio-add-notes
romio: update REFRESH_NOTES to accommodate the random() patch.
2016-11-28 16:37:23 -05:00
KAWASHIMA Takahiro
9bfca8b274 pml/ob1: Reduce per-rank memory footprint slightly
`sturct mca_pml_ob1_comm_proc_t`, which is allocated per
connected rank in a communicator, had two paddings after
`expected_sequence` and `send_sequence` by alignments.
By changing the order of the members, the size of
`mca_pml_ob1_comm_proc_t` is reduced by 8 bytes on 64-bit
architectures.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2016-11-28 19:20:48 +09:00
Edgar Gabriel
b10558c3da fcoll/dynamic_gen2: fix bug exposed by uneven distribution of data
This fixes a bug reported in-house occuring with this component. It is triggered if the data assigned to different aggregators is highly differing, leading to different number of internal iterations required to handle it.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2016-11-24 13:02:19 -06:00
Ralph Castain
1e2019ce2a Revert "Update to sync with OMPI master and cleanup to build"
This reverts commit cb55c88a8b.
2016-11-22 15:03:20 -08:00
Ralph Castain
cb55c88a8b Update to sync with OMPI master and cleanup to build
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-22 14:24:54 -08:00
Gilles Gouaillardet
2c94a3a6f3 coll/libnbc: fix race condition with multi threaded apps
protect the mca_coll_libnbc_component.active_requests list with
the new mca_coll_libnbc_component.lock mutex.

Thanks Jie Hu for the report

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-11-21 10:21:47 +09:00
Jijo Varghese
25e138ea1d error correction to the MPI_file operations thread safety lock
Signed-off-by: Jijo Varghese <jijo733@gmail.com>
2016-11-17 08:18:49 -05:00
Gilles Gouaillardet
bd364d29f7 osc/sm: plug an other memory leak in ompi_osc_sm_free
Fixes open-mpi/ompi@f1b473ee63

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-11-14 23:19:07 -07:00
Gilles Gouaillardet
f1b473ee63 osc/sm: plug a memory leak in ompi_osc_sm_free
Thanks Joseph Schuchart for the report.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-11-14 22:22:43 -07:00
Joshua Hursey
5a8b2f7431 topo/base: Fix module reference in collective call
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2016-11-14 11:34:54 -06:00
Gilles Gouaillardet
fc776e3fa5 coll: code cleanup
- instead of coll_base_comm_get_reqs(2) for irecv/isend, use only
   one request allocated in the stack and do a irecv/send

 - instead of ompi_request_wait_all(2), simpy ompi_request_wait

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-11-13 22:35:33 -07:00
Gilles Gouaillardet
99d30353af coll: Don't allocate space for zero requests
Refs #2402

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-11-13 22:20:58 -07:00
George Bosilca
725277bc26 Don't allocate space for the requests if the
underlying topology has no neighbors.

This commit fixes issue #2402.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2016-11-12 18:01:09 -05:00
Gilles Gouaillardet
023d18abae pml/ob1: mca_pml_ob1_recv must have memchecker mark the buffer as defined upon success
this is generally done in mca_pml_ob1_recv_request_free(), but this is not invoked
in via mca_pml_ob1_recv(), so do it manually

Thanks Yvan Fournier for the report

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-11-07 13:10:15 +09:00
Jeff Squyres
f11b0c7edf Merge pull request #2330 from jjhursey/topic/ibcast-non-uniform-dt-wa
coll/libnbc: Work around for non-uniform data types in ibcast
2016-11-05 10:26:04 -04:00
Joshua Hursey
350ef67fe0 coll/libnbc: Work around for non-uniform data types in ibcast
* If (legal) non-uniform data type signatures are used in ibcast
   then the chosen algorithm may fail on the request, and worst case
   it could produce wrong answers.
 * Add an MCA parameter that, by default, protects the user from this
   scenario. If the user really wants to use it then they have to
   'opt-in' by setting the following parameter to false:
   - `-mca coll_libnbc_ibcast_skip_dt_decision f`
 * Once the following Issues are resolved then this parameter can
   be removed.
   - https://github.com/open-mpi/ompi/issues/2256
   - https://github.com/open-mpi/ompi/issues/1763

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2016-11-01 13:33:23 -05:00
Yossi Itigin
17c8f76411 pml_ucx: fix uninitialized field req_status->_cancelled.
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2016-11-01 17:02:22 +02:00
Thananon Patinyasakdikul
ea2d38de14 romio: update REFRESH_NOTES to accommodate the random() patch.
From patch: open-mpi/ompi@23b27c510c
Signed-off-by: Thananon Patinyasakdikul <tpatinya@utk.edu>
2016-10-31 16:08:08 -04:00
Joshua Ladd
d27b680de2 Merge pull request #2305 from vspetrov/hcoll_fortran_pair_types
coll/hcoll fortran pair types
2016-10-28 12:05:00 -04:00
Gilles Gouaillardet
af67183e2f pml/v: fix a memory leak
close the framework if no more component should be used

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-10-28 09:32:30 +09:00
Valentin Petrov
2b7e362e56 coll/hcoll fortran pair types
Adds mapping of the MPI Fortran pair types (2INTEGER, 2REAL, 2DBLPREC)
    to the corresponding hcoll dtypes.

Signed-off-by: Valentin Petrov <valentinp@mellanox.com>
2016-10-27 18:24:07 +03:00
Edgar Gabriel
2076622924 Merge pull request #2238 from edgargabriel/pr/delete-error-codes
update the error codes reported by file_delete
2016-10-25 12:38:03 -05:00
George Bosilca
028e747470 Do not alter ompi_coll_tuned_use_dynamic_rules.
This is set globally as an MCA parameter and should be never
altered based on a single communicator setting.
2016-10-25 12:17:25 -04:00
George Bosilca
253eb80e26 Code cleaning of the tuned module. 2016-10-25 12:17:25 -04:00
Edgar Gabriel
74441b960b update the error codes reported by file_delete 2016-10-25 10:15:14 -05:00
Gilles Gouaillardet
8e788b5aee pml/ob1: refactor append_recv_req_to_queue() to improve readability
and fix a typo in a comment

Thanks George for the patch
2016-10-25 10:50:40 +09:00
Gilles Gouaillardet
4a886ac4cc pml/ob1: correctly reset receive request type before init
recvreq->req_recv.req_base.req_type should always be set before invoking
MCA_PML_OB1_RECV_REQUEST_INIT(recvreq, ...) otherwise, the previous type
might be set, and you could end up with MPC_PML_REQUEST_IMPROBE when
MCA_PML_REQUEST_RECV is expected.

Thanks Chris Pattison for the report and test case.

Fixes open-mpi/ompi#2275
2016-10-24 16:50:23 +09:00
Gilles Gouaillardet
6714f6aee7 coll/libnbc: fix MPI_Ialltoallv with MPI_IN_PLACE and without MPI param check 2016-10-24 09:29:06 +09:00
Josh Hursey
d1ecc83e14 Merge pull request #2245 from jjhursey/topic/libnbc-error-path
coll/libnbc: Fix error path on internal error
2016-10-21 13:27:17 -05:00
Joshua Hursey
8748e54c11 coll/libnbc: Fix error path on internal error
* If an error is detected internal to libnbc (e.g., PML truncation error)
   this patch makes sure that the request is completed and the `MPI_ERROR`
   field is set approprately.
 * Make an attempt to cleanup outstanding requests before returning.
   - This is a "best attempt" since not all PMLs support canceling requests.
2016-10-21 11:41:08 -04:00
Gilles Gouaillardet
45336d0bea libnbc: fix iallgather[v]
In order to optimize for MPI_IN_PLACE, data is sent from the receive buffer.
consequently, it should be sent with the receive type and count.

Thanks Josh Hursey for the report and test case

Refs open-mpi/ompi#2256
2016-10-21 10:24:25 +09:00
Gilles Gouaillardet
e78fcc4db9 coll/base: fix ompi_coll_base_{gather,scatter}_intra_binomial
receive type is only relevant for root with gather,
send type is only relevant for root with scatter,
so do not access these types on a non root task
2016-10-19 14:05:22 +09:00
Gilles Gouaillardet
1e3191115b Merge pull request #2172 from ggouaillardet/topic/ialltoall_in_place
support MPI_IN_PLACE in MPI_Ialltoall*
2016-10-17 17:00:47 +09:00
Joshua Ladd
64a15188bd Merge pull request #2199 from vspetrov/coll_hcoll_ialltoallv
coll/hcoll: ialltoallv interface
2016-10-14 07:59:23 -06:00
Gilles Gouaillardet
9389de4199 topo/treematch: fix displacements in mca_topo_treematch_dist_graph_create() 2016-10-14 17:16:49 +09:00
Joshua Ladd
b661307e6f Merge pull request #2218 from yosefe/topic/ucx-pml-spml-update
ucx: adapt pml_ucx and spml_ucx to new UCX APIs
2016-10-13 09:23:37 -04:00
Gilles Gouaillardet
958e29f929 osc/rdma: silence a warning
declare a local variable volatile and silence CID 1372692
2016-10-13 16:10:07 +09:00
Yossi Itigin
05ca466c6b ucx: adapt pml_ucx and spml_ucx to new UCX APIs
- pass field_mask to ucp_init().
- use non-blocking disconnect.
- recv() with pre-allocated request.
- call opal_progress() from iprobe() and improbe().
- use shift pattern in connect/disconnect.
2016-10-12 23:45:45 +03:00
Nathan Hjelm
e8ef503bee osc/rdma: fix warnings
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-10-12 10:17:25 -06:00
Nathan Hjelm
432d79046b Merge pull request #2197 from tkordenbrock/topic/master/osc-rdma.put.use.true_extent
osc-rdma: fix datatype lower bound errors in ompi_osc_rdma_master()
2016-10-11 10:42:02 -06:00
Valentin Petrov
9747a9ea9b coll/hcoll: ialltoallv interface 2016-10-10 15:09:07 +03:00