There are only five places in the non-daemon code paths where opal_hwloc_topology is currently referenced:
* shared memory BTLs (sm, smcuda). I have added a code path to those components that uses the location string
instead of the topology itself, if available, thus avoiding instantiating the topology (see the sketch below)
* openib BTL. This uses the distance matrix. At present, I haven't developed a method
for replacing that reference. Thus, this component will instantiate the topology
* usnic BTL. Uses the distance matrix.
* treematch TOPO component. Does some complex tree-based algorithm, so it will instantiate
the topology
* ess base functions. If a process is direct launched and not bound at launch, this
code attempts to bind it. Thus, procs in this scenario will instantiate the
topology
Note that instantiating the topology on complex chips such as KNL can consume
megabytes of memory.
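A rough sketch of the idea behind the sm/smcuda change, using a made-up location-string format and comparison helper (the real OPAL locality encoding and helper functions differ): peer locality can be decided by comparing strings, so the hwloc topology never has to be loaded.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical locality strings of the form "node0:socket1:core5".
 * This only illustrates deciding peer locality from strings instead of
 * a topology object; it is not the actual OPAL representation. */
static int same_prefix(const char *a, const char *b, int fields)
{
    /* compare the first 'fields' colon-separated components */
    for (int i = 0; i < fields; i++) {
        size_t la = strcspn(a, ":"), lb = strcspn(b, ":");
        if (la != lb || 0 != strncmp(a, b, la)) return 0;
        a += la; b += lb;
        if (*a == ':') a++;
        if (*b == ':') b++;
    }
    return 1;
}

int main(void)
{
    const char *me   = "node0:socket1:core5";   /* this proc's location string */
    const char *peer = "node0:socket1:core7";   /* a peer's location string    */

    /* Same-node and same-socket checks need only the strings, so the
     * potentially multi-megabyte topology is never instantiated. */
    printf("same node:   %s\n", same_prefix(me, peer, 1) ? "yes" : "no");
    printf("same socket: %s\n", same_prefix(me, peer, 2) ? "yes" : "no");
    return 0;
}
```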
Fix pernode binding policy
Properly handle the unbound case
Correct pointer usage
Do not free static error messages!
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
* When using `MPI_Put` with `MPI_Win_lock_all`, a hang is possible since
the `put` is waiting on `eager_send_active` to become `true` but
that variable might not be reset in the case of `MPI_Win_lock_all`,
depending on other incoming events (e.g., `post` or ACKs of lock
requests).
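A minimal sketch of the kind of program that could hit this hang (not the original reproducer), assuming two or more ranks:

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, nprocs, buf = 0;
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    MPI_Win_create(&buf, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    /* Passive-target epoch covering all ranks; per the description above,
     * the put below could block waiting for eager_send_active to become
     * true again. */
    MPI_Win_lock_all(0, win);
    if (0 == rank) {
        int one = 1;
        for (int peer = 1; peer < nprocs; peer++) {
            MPI_Put(&one, 1, MPI_INT, peer, 0, 1, MPI_INT, win);
        }
    }
    MPI_Win_unlock_all(win);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```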
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
* When using `MPI_Lock`/`MPI_Unlock` with `MPI_Get` and non-contiguous
datatypes, it is possible that the unlock finishes too early, before
the data is actually present in the recv buffer.
* We need to wait for the irecv to complete before unlocking the target.
This commit waits for the outgoing fragment counts to become equal
before unlocking.
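A sketch of the access pattern involved, assuming at least two ranks (not the original reproducer): a strided, non-contiguous vector type is fetched with `MPI_Get`, and the fix ensures the unlock does not return before the data has landed in the result buffer.

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, win_buf[8] = {0}, result[8] = {0};
    MPI_Win win;
    MPI_Datatype strided;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* every other int: a non-contiguous datatype */
    MPI_Type_vector(4, 1, 2, MPI_INT, &strided);
    MPI_Type_commit(&strided);

    if (0 == rank) for (int i = 0; i < 8; i++) win_buf[i] = i;

    MPI_Win_create(win_buf, sizeof(win_buf), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    if (1 == rank) {
        MPI_Win_lock(MPI_LOCK_SHARED, 0, 0, win);
        MPI_Get(result, 1, strided, 0, 0, 1, strided, win);
        /* unlock must not return before the strided data is in 'result' */
        MPI_Win_unlock(0, win);
    }

    MPI_Win_free(&win);
    MPI_Type_free(&strided);
    MPI_Finalize();
    return 0;
}
```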
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
* If the user uses PSCW synchronization after a Fence, then the previous
epoch is not reset, which can cause the PSCW exchange to transfer data
before it is ready, leading to wrong answers.
* This commit resets the `eager_send_active` flag in the start call.
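A sketch of the synchronization sequence in question, assuming exactly two ranks (not the original reproducer): a fence epoch followed by a PSCW epoch, where state left over from the fence could previously let the second put begin before the target was ready.

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, buf = 0, val;
    MPI_Win win;
    MPI_Group world_grp, peer_grp;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_group(MPI_COMM_WORLD, &world_grp);

    MPI_Win_create(&buf, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    /* First epoch: fence synchronization */
    MPI_Win_fence(0, win);
    if (0 == rank) { val = 1; MPI_Put(&val, 1, MPI_INT, 1, 0, 1, MPI_INT, win); }
    MPI_Win_fence(0, win);

    /* Second epoch: PSCW between ranks 0 and 1 */
    int peer = (0 == rank) ? 1 : 0;
    MPI_Group_incl(world_grp, 1, &peer, &peer_grp);
    if (0 == rank) {
        MPI_Win_start(peer_grp, 0, win);
        val = 2;
        MPI_Put(&val, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
        MPI_Win_complete(win);
    } else {
        MPI_Win_post(peer_grp, 0, win);
        MPI_Win_wait(win);
    }

    MPI_Group_free(&peer_grp);
    MPI_Group_free(&world_grp);
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```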
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
According to the MPI-3.1 p.52 and p.53 (cited below), a request
created by `MPI_*_INIT` but not yet started by `MPI_START` or
`MPI_STARTALL` is inactive therefore `MPI_WAIT` or its friends
must return immediately if such a request is passed.
The current implementation hangs in `MPI_WAIT` and its friends
in such a case because a persistent request is initialized with
`req_complete = REQUEST_PENDING`. This commit fixes the
initialization.
Also, this commit fixes internal requests used in `MPI_PROBE`
and `MPI_IPROBE`, which were wrongly marked as persistent.
MPI-3.1 p.52:
We shall use the following terminology: A null handle is a handle
with value MPI_REQUEST_NULL. A persistent request and the handle
to it are inactive if the request is not associated with any ongoing
communication (see Section 3.9). A handle is active if it is neither
null nor inactive. An empty status is a status which is set to return
tag = MPI_ANY_TAG, source = MPI_ANY_SOURCE, error = MPI_SUCCESS, and
is also internally configured so that calls to MPI_GET_COUNT,
MPI_GET_ELEMENTS, and MPI_GET_ELEMENTS_X return count = 0 and
MPI_TEST_CANCELLED returns false. We set a status variable to empty
when the value returned by it is not significant. Status is set in
this way so as to prevent errors due to accesses of stale information.
MPI-3.1 p.53:
One is allowed to call MPI_WAIT with a null or inactive request
argument. In this case the operation returns immediately with empty
status.
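For illustration, a minimal sketch (not from the original report) of the case this commit fixes: the first `MPI_Wait` below is applied to an initialized but not-yet-started persistent request and must return immediately with an empty status. Runs on a single rank.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int buf = 42, out;
    MPI_Request req;
    MPI_Status status;

    MPI_Init(&argc, &argv);

    /* Create a persistent send request but do not start it yet. */
    MPI_Send_init(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);

    /* The request is inactive: MPI_Wait must return immediately
     * with an empty status (previously this could hang). */
    MPI_Wait(&req, &status);

    /* Now actually use it: start, match it with a receive, and wait. */
    MPI_Start(&req);
    MPI_Recv(&out, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Wait(&req, MPI_STATUS_IGNORE);
    printf("received %d\n", out);

    MPI_Request_free(&req);
    MPI_Finalize();
    return 0;
}
```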
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
Adds the new API hcoll_context_free, which resolves the issues
observed with the ctx cache and group_destroy_notify.
Signed-off-by: Valentin Petrov <valentinp@mellanox.com>
`struct mca_pml_ob1_comm_proc_t`, which is allocated per
connected rank in a communicator, had two padding gaps after
`expected_sequence` and `send_sequence` caused by alignment.
By changing the order of the members, the size of
`mca_pml_ob1_comm_proc_t` is reduced by 8 bytes on 64-bit
architectures.
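A simplified, hypothetical illustration of this kind of reordering (not the actual `mca_pml_ob1_comm_proc_t` layout): grouping the two small counters together removes the alignment holes that follow them on a 64-bit ABI.

```c
#include <stdint.h>
#include <stdio.h>

/* Before: each 16-bit counter sits before an 8-byte-aligned member,
 * so the compiler inserts padding after each of them. */
struct proc_before {
    void    *ompi_proc;          /* 8 bytes */
    uint16_t expected_sequence;  /* 2 bytes + 6 bytes padding */
    void    *frags_cant_match;   /* 8 bytes */
    uint16_t send_sequence;      /* 2 bytes + 6 bytes padding */
    void    *endpoint;           /* 8 bytes */
};                               /* typically 40 bytes on 64-bit */

/* After: the two counters are adjacent, so only one smaller gap remains. */
struct proc_after {
    void    *ompi_proc;
    void    *frags_cant_match;
    void    *endpoint;
    uint16_t expected_sequence;
    uint16_t send_sequence;      /* 4 bytes trailing padding */
};                               /* typically 32 bytes on 64-bit */

int main(void)
{
    printf("before: %zu bytes, after: %zu bytes\n",
           sizeof(struct proc_before), sizeof(struct proc_after));
    return 0;
}
```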
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
This fixes a bug reported in-house that occurs in this component. It is triggered when the amounts of data assigned to different aggregators differ significantly, leading to different numbers of internal iterations being required to handle them.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
protect the mca_coll_libnbc_component.active_requests list with
the new mca_coll_libnbc_component.lock mutex.
Thanks Jie Hu for the report
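The general pattern, shown here as a minimal POSIX-threads sketch rather than with the OPAL list and mutex types the component actually uses: every modification of the shared active-requests list happens while holding the component-level lock.

```c
#include <pthread.h>
#include <stdlib.h>

/* Minimal stand-ins for the component state: a singly linked list of
 * outstanding requests plus the mutex that now guards it. */
struct request_item {
    struct request_item *next;
    void *request;
};

struct component_state {
    struct request_item *active_requests;
    pthread_mutex_t lock;
};

static struct component_state comp = {
    .active_requests = NULL,
    .lock = PTHREAD_MUTEX_INITIALIZER,
};

/* Append under the lock; a progress thread and the application thread
 * could otherwise race on the list head. */
static void add_active_request(void *request)
{
    struct request_item *item = malloc(sizeof(*item));
    item->request = request;
    pthread_mutex_lock(&comp.lock);
    item->next = comp.active_requests;
    comp.active_requests = item;
    pthread_mutex_unlock(&comp.lock);
}

int main(void)
{
    int a, b;
    add_active_request(&a);
    add_active_request(&b);
    return 0;
}
```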
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
change the default value of the mca_io_ompio_cycle_buffer_size parameter in order to avoid accidental truncation of a file for very large individual operations.
Thanks to @cniethammer for reporting it.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
- instead of coll_base_comm_get_reqs(2) for irecv/isend, use only
one request allocated on the stack and do an irecv/send
- instead of ompi_request_wait_all(2), simply use ompi_request_wait
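The same idea expressed with the public MPI API rather than the internal coll/PML calls (names and context here are illustrative only): a pairwise exchange needs just one stack-allocated request and a single wait when the send is blocking.

```c
#include <mpi.h>
#include <stdio.h>

/* Pairwise exchange with 'peer' using one request instead of two.
 * The irecv is posted first to avoid deadlock, then the blocking send
 * runs, and finally the single receive request is waited on. */
static void exchange(int peer, int send_val, int *recv_val, MPI_Comm comm)
{
    MPI_Request req;   /* single request on the stack */

    MPI_Irecv(recv_val, 1, MPI_INT, peer, 0, comm, &req);
    MPI_Send(&send_val, 1, MPI_INT, peer, 0, comm);
    MPI_Wait(&req, MPI_STATUS_IGNORE);
}

int main(int argc, char **argv)
{
    int rank, size, got = -1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size >= 2 && rank < 2) {
        exchange(1 - rank, rank, &got, MPI_COMM_WORLD);
        printf("rank %d received %d\n", rank, got);
    }

    MPI_Finalize();
    return 0;
}
```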
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
this is generally done in mca_pml_ob1_recv_request_free(), but that is not invoked
via mca_pml_ob1_recv(), so do it manually
Thanks to Yvan Fournier for the report
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
* If (legal) non-uniform data type signatures are used in ibcast,
then the chosen algorithm may fail on the request, and in the worst
case it could produce wrong answers (see the sketch below).
* Add an MCA parameter that, by default, protects the user from this
scenario. If the user really wants to use it then they have to
'opt-in' by setting the following parameter to false:
- `-mca coll_libnbc_ibcast_skip_dt_decision f`
* Once the following issues are resolved, this parameter can
be removed.
- https://github.com/open-mpi/ompi/issues/2256
- https://github.com/open-mpi/ompi/issues/1763
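A sketch of the legal but non-uniform signature case in question (illustrative only): every rank passes the same type signature, four ints, to `MPI_Ibcast`, but through different count/datatype combinations, which is what the algorithm-selection logic could mishandle.

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, buf[4] = {0};
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (0 == rank) {
        /* Root describes the buffer as one contiguous type of 4 ints... */
        MPI_Datatype four_ints;
        MPI_Type_contiguous(4, MPI_INT, &four_ints);
        MPI_Type_commit(&four_ints);
        for (int i = 0; i < 4; i++) buf[i] = i;
        MPI_Ibcast(buf, 1, four_ints, 0, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
        MPI_Type_free(&four_ints);
    } else {
        /* ...while the others use count=4 of MPI_INT: the same signature
         * via different datatype arguments. This is legal MPI, but it is
         * the case the new MCA parameter guards against. */
        MPI_Ibcast(buf, 4, MPI_INT, 0, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}
```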
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>