openmpi

Author	SHA1	Message	Date
Howard Pritchard	f96994b12f	Merge pull request #6865 from rhc54/cmr40/locality Provide locality for all procs on node	2019-08-19 13:26:59 -06:00
Howard Pritchard	7b09c15b90	Merge pull request #6892 from janjust/v4.0.x-osc_fix v4.0.x: osc/ucx: Fix possible win creation/destruction race condition	2019-08-19 13:26:32 -06:00
George Bosilca	c9f48e2e77	Whitespace cleanup No code or logic changes. Signed-off-by: George Bosilca <bosilca@icl.utk.edu> Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2019-08-16 10:27:43 -04:00
Edgar Gabriel	d72d39bfee	io_ompio_file_open: fix offset calculation with SEEK_END and SEEK_CUR. fixes an issue reported by Wei-keng Liao Fixes Issue #6858 Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2019-08-16 09:03:10 -05:00
Ralph Castain	e17203b4f7	Silence Coverity warning Signed-off-by: Ralph Castain <rhc@pmix.org>	2019-08-12 12:42:41 -07:00
Ralph Castain	14f3fbb8c1	Provide locality for all procs on node Update PMIx to latest master to get supporting updates. For connect/accept (part of comm_spawn as well), lookup locality for all participating procs on the node and compute the relative locality so it can be used for MPI operations. Signed-off-by: Ralph Castain <rhc@pmix.org> (cherry picked from commit `d202e10c14`)	2019-08-12 12:42:40 -07:00
Tomislav Janjusic	e9a0343780	osc/ucx: Fix possible win creation/destruction race condition To avoid fully initializing the osc/ucx component for MPI applications that are not using One-Sided functionality, the initialization happens at the first MPI window creation. This commit ensures atomicity of global state modifications. ported from: `6678ac0f55` Signed-off-by: Artem Polyakov <artpol84@gmail.com> fix alignment, and fix error path	2019-08-12 22:23:17 +03:00
Gilles Gouaillardet	39ec580b76	coll/base: only retain datatypes/op if the request has not yet completed a non blocking collective might return ompi_request_null, so we should not retain anything in that case. Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> (cherry picked from commit open-mpi/ompi@63d3ccde9d)	2019-08-13 00:13:40 +09:00
Gilles Gouaillardet	ae26957619	coll/base: cleanup ompi_coll_base_nbc_request_t elements Since ompi_coll_base_nbc_request_t is to be used in an opal_free_list_t, it must be returned into a "clean" state. So cleanup some data in the callback completion subroutines. This fixes a regression introduced in open-mpi/ompi@0fe756d416 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> (cherry picked from commit open-mpi/ompi@0862c409f1)	2019-08-13 00:13:40 +09:00
Gilles Gouaillardet	b37c85dcca	coll/libnbc: fixes ompi ompi_coll_libnbc_request_t parent base ompi_coll_libnbc_request_t on top of ompi_coll_base_nbc_request_t to correctly support the retention of datatypes/operators This fixes a regression introduced in open-mpi/ompi@0fe756d416 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> (cherry picked from commit open-mpi/ompi@f8eef0fde9)	2019-08-13 00:13:40 +09:00
Sergey Oblomov	2fa112c0a6	UCX: added PPN hint for UCX context - added PPN hint for UCX context init Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com> (cherry picked from commit `43186e494b`) Conflicts: opal/mca/common/ucx/common_ucx_wpool.c	2019-08-09 11:51:30 +03:00
George Bosilca	8b794235b8	Update the datatype dump to match the actual types. Update the comments to better reflect what is going on. Minor indentations. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2019-08-05 09:37:47 -04:00
George Bosilca	4f754d0156	Optimized datatype description. Move toward a base type of vector (count, type, blocklen, extent, disp) with disp and extent applying toward the count repetition and blocklen being a contiguous memory of type type. Implement 2 optimizations on this description used during type_commit: - collapse: successive similar datatype descriptions are collapsed together with an increased count. - fusion: fuse successive datatype descriptions in order to minimize the number of resulting memcpy during pack/unpack. Fixes at the OMPI datatype level including: - Fix the create_hindexed and vector creation. - Fix the handling of [get|set]_elements and _count. - Correctly compute the displacement for block indexed types. - Support the MPI_LB and MPI_UB deprecation, aka. OMPI_ENABLE_MPI1_COMPAT. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2019-08-05 09:35:07 -04:00
George Bosilca	f68b06e9ee	Fix incorrect behavior with length == 0 Fixes #6575. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2019-08-05 09:33:28 -04:00
Howard Pritchard	e547a2b94d	Merge pull request #6838 from ggouaillardet/topic/v4.0.x/misc_fortran_bindings v4.0.x: misc Fortran related backports	2019-08-02 13:00:31 -06:00
Howard Pritchard	31aa52f11a	Merge pull request #6846 from nysal/topic/v4.0.x/ucx_accumulate_fix v4.0.x: osc/ucx: Fix data corruption with non-contiguous accumulates	2019-08-02 12:43:40 -06:00
Nysal Jan K.A	359cdf2b53	osc/ucx: Fix data corruption with non-contiguous accumulates Signed-off-by: Nysal Jan K.A <jnysal@in.ibm.com> (cherry picked from commit `3529d44702`)	2019-07-26 14:41:08 +05:30
Mikhail Brinskii	b9998a14dc	COLL/TUNED: Minor var names/comments fixes Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com> (cherry picked from commit `65618f8db8`)	2019-07-26 11:29:12 +03:00
Mikhail Brinskii	3d5b7b4a1b	COLL/TUNED: Update alltoall selection rule for mlx Use linear with sync alltoall algorithm for certain message/comm size ranges. Does not affect default fixed decision, unless HPCX (with its custom parameters) is used or corresponding mca is set. Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com> (cherry picked from commit `404c480068`)	2019-07-26 11:28:47 +03:00
KAWASHIMA Takahiro	1ffb9b10bb	pcollreq/mpif-h: fix MPIX_Alltoallw_init() binding These issues were introduced in the recent commit `b71af0eca0`. This commit fixes Coverity CID 1451661 and 1451660. Though `c_info` part was an actual bug, the `c_sendtypes` part was not. Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com> (cherry picked from commit open-mpi/ompi@facf8c5e98)	2019-07-24 17:12:10 +09:00
Gilles Gouaillardet	13ba2b0d75	pcollreq/mpif-h: fix MPIX_Alltoallw_init() binding Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> (cherry picked from commit open-mpi/ompi@b71af0eca0)	2019-07-24 17:11:44 +09:00
Gilles Gouaillardet	5ab26e490a	fortran/mpif-h: fix [i]alltoallw bindings Fix a regression introduced in open-mpi/ompi@cdaed89d04 Fixes CID 1451610, 1451611 and 1451612 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> (cherry picked from commit open-mpi/ompi@ed703bec1b)	2019-07-24 17:10:58 +09:00
Gilles Gouaillardet	fbf7d31fd1	fortran/mpif-h: fix MPI_[I]Alltoallw() binding - ignore sendcounts, sendispls and sendtypes arguments when MPI_IN_PLACE is used - use the right size when an inter-communicator is used. Thanks Markus Geimer for reporting this. Refs. open-mpi/ompi#5459 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> (cherry picked from commit open-mpi/ompi@cdaed89d04)	2019-07-24 17:10:27 +09:00
Gilles Gouaillardet	aae73d9cf7	fortran/mpif-h: fix C to Fortran error code conversion - remove incorrect use of OMPI_INT_2_FINT() - use homogenous syntax (e.g. c_ierr = PMPI_...()) Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> (cherry picked from commit open-mpi/ompi@223e6cc537)	2019-07-24 17:10:00 +09:00
Howard Pritchard	667aba9913	Merge pull request #6810 from janjust/v4.0.x v4.0.x OSC: Reset external request to NULL	2019-07-23 09:05:03 -06:00
Tomislav Janjusic	63605fc466	v4.0.x OSC: Reset external request to NULL to avoid double request completion Co-authored with Artem Polyakov <artemp@mellanox.com> Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>	2019-07-12 22:49:34 +03:00
Gilles Gouaillardet	c9e4240e70	mpi: retain operation and datatype in non blocking collectives MPI standard states a user MPI_Op and/or user MPI_Datatype can be free'd after a call to a non blocking collective and before the non-blocking collective completes. Retain user (only) MPI_Op and MPI_Datatype when the non blocking call is invoked, and set a request callback so they are free'd when the MPI_Request completes. Thanks Thomas Ponweiser for reporting this Fixes open-mpi/ompi#2151 Fixes open-mpi/ompi#1304 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> (cherry picked from commit open-mpi/ompi@0fe756d416)	2019-07-12 10:27:04 +09:00
Aurelien Bouteiller	9499dcfe41	Manage errors in NBC collective ops Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu> Correctly bubble up errors in NBC collective operations Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu> The error field of requests needs to be rearmed at start, not at create Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu> (cherry picked from commit open-mpi/ompi@65660e5999)	2019-07-12 10:26:08 +09:00
Nysal Jan K.A	b6da090090	pml/ucx: Fix the max tag and context id values Signed-off-by: Nysal Jan K.A <jnysal@in.ibm.com> (cherry picked from commit `fe4ef147f8`)	2019-07-03 16:38:07 +03:00
Geoff Paulsen	514e273968	Merge pull request #6770 from devreal/osc_winalloc_err_v4.0.x OSC rdma win allocate: propagate errors to avoid deadlocks (v4.0.x)	2019-06-28 14:04:12 -05:00
Howard Pritchard	6424857029	Merge pull request #6634 from jsquyres/pr/v4.0.x/ob1-fixes v4.0.x: Cherry pick ob1 fixes from master	2019-06-26 10:49:32 -06:00
Harald Klimach	16e1d74c8f	Suggestion to fix division by zero in file view. In common_ompi_aggregators calc_cost routine: do not cast the real division to an int intermediately. This patch removes the obsolete int variable c and assigns the result of the P_a/P_x division directly to n_as. With the intermediate int c variable, n_as gets 0 if P_a < P_x, resulting in a division by 0 when computing n_s. Signed-off-by: Harald Klimach <harald.klimach@uni-siegen.de> (cherry picked from commit `e222a04ae5`)	2019-06-25 09:29:08 -06:00
Howard Pritchard	28d300915f	Merge pull request #6725 from bosilca/cherrypick/6683 Cherrypick/6683	2019-06-24 13:24:02 -06:00
Joseph Schuchart	c5cf3432b9	OSC rdma win allocate: synchronize error codes across shared memory group Signed-off-by: Joseph Schuchart <schuchart@hlrs.de> (cherry picked from commit `8f27cc26d9`)	2019-06-24 17:49:26 +02:00
Howard Pritchard	73c4aac12d	Merge pull request #6750 from brminich/topic/all2all_linear_sync_fix_v4.0 COLL/BASE: Fix linear sync all2all - v4.0.x	2019-06-17 13:45:52 -06:00
Howard Pritchard	cb8dd569ff	Merge pull request #6747 from devreal/rdma-fetchop-local-v4.0.x OSC rdma: make sure accumulating in shared memory is safe	2019-06-13 18:55:53 -06:00
Mikhail Brinskii	adba7f55f7	COLL/BASE: Fix linear sync all2all Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com> (cherry picked from commit `79006f4e5a`)	2019-06-09 21:31:19 +03:00
Joseph Schuchart	900f0fa21f	OSC rdma: make sure accumulating in shared memory is safe Signed-off-by: Joseph Schuchart <schuchart@hlrs.de> (cherry picked from commit `c67e229193`)	2019-06-07 12:45:00 +02:00
Tsubasa Yanagibashi	5dd8830dca	mpiext/pcollreq: Add `_f08` to procedure names The procedure names don't contain "_f08" of Fortran 2008 bindings of Persistent Collective Operations(mpiext/pcollreq/use-mpi-f08). This fix adds "_f08" to the procedure names of pcollreq/use-mpi-f08, same as other Fortran 2008 routines in `ompi/mpi/fortran/use-mpi-f08/mod`. Signed-off-by: Tsubasa Yanagibashi <fj2505dt@aa.jp.fujitsu.com> (cherry picked from commit `3148b0cfaa`)	2019-06-07 10:59:01 +09:00
Geoff Paulsen	a04f5f0c70	Merge pull request #6692 from vspetrov/v4.0.x V4.0.x Coll/hcoll: don't init opal memhooks unless explicitly requested	2019-06-03 15:00:36 -05:00
George Bosilca	a8d5da67db	Fix the man pages for some of the MPI_T_* functions. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2019-05-31 00:19:14 -04:00
George Bosilca	dbf89404d7	Fix the SPC initialization. Use the PVAR ctx to save the SPC index, so that no lookup nor restriction on the SPC vars position is imposed. Make sure the PVAR are always registered. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2019-05-31 00:19:14 -04:00
George Bosilca	cadf315ca9	Fixed SPC/MPI_T initialization error. Signed-off-by: Yong Qin <yongq@mellanox.com>	2019-05-30 17:54:26 -04:00
Howard Pritchard	e78851a6c7	Merge pull request #6704 from edgargabriel/pr/v4.0.x-empty-fileview-fix common/ompio: fix division by zero problem with empty fview	2019-05-26 09:45:52 -06:00
Howard Pritchard	386ed07d54	Merge pull request #6689 from hoopoepg/topic/suppressed-pml-ucx-mt-warning-v4.0 PML/UCX: disable PML UCX if MT is requested but not supported - v4.0	2019-05-26 09:44:05 -06:00
Edgar Gabriel	c7250cd11d	common/ompio: fix division by zero problem with empty fview When using an empty fileview, a division by zero bug can occur in ompio. Not entirely sure why the problem did not show up previously, but some recent changes trigger that bug in one of our tests. This pr is part of a fix applied in commit `f6b3a0a` Fixes Issue #6703 Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2019-05-23 13:48:57 -05:00
Valentin Petrov	8f82c899bc	Coll/hcoll: don't init opal memhooks unless explicitly requested by user If user sets HCOLL_EXTERNAL_UCM_EVENTS=1 then we try to init opal memory framework and register a mem release cb. Otherwise, rely on ucx. Signed-off-by: Valentin Petrov <valentinp@mellanox.com>	2019-05-20 14:00:50 +03:00
Sergey Oblomov	1edd36638b	PML/UCX: disable PML UCX if MT is requested but not supported - in case if multithreading requested but not supported disable PML UCX Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com> (cherry picked from commit `a3578d9ece`)	2019-05-20 09:59:59 +03:00
Yossi Itigin	4f9fb3e9ce	OSC/UCX: Fix deadlock with atomic lock Atomic lock must progress local worker while obtaining the remote lock, otherwise an active message which actually releases the lock might not be processed while polling on local memory location. (picked from master `9d1994b`) Signed-off-by: Yossi Itigin <yosefe@mellanox.com>	2019-05-20 09:54:01 +03:00
George Bosilca	4946570b24	Remove few warnings identified by @rhc in #5514 . Signed-off-by: George Bosilca <bosilca@icl.utk.edu> (cherry picked from commit open-mpi/ompi@6d11a45f44)	2019-05-11 16:38:31 +09:00
Geoff Paulsen	73f9bcc374	Merge pull request #6632 from brminich/topic/shmem_all2all_put_4.0.x SPML/UCX: Add shmemx_alltoall_global_nb routine to shmemx.h 4.0.x	2019-05-07 08:05:01 -05:00
Howard Pritchard	8e968f16a6	Merge pull request #6626 from ggouaillardet/topic/v4.0.x/mpi_combiner_xyz_integer v4.0.x: mpi: mark MPI_COMBINER_{HVECTOR,HINDEXED,STRUCT}_INTEGER removed	2019-05-04 07:25:40 -06:00
George Bosilca	48f824327c	Fix the leak of fragments for persistent sends. The rdma_frag attached to the send request was not correctly released upon request completion, leaking until MPI_Finalize. A quick solution would have been to add RDMA_FRAG_RETURN at different locations on the send request completion, but it would have unnecessarily made the sendreq completion path more complex. Instead, I added the length to the RDMA fragment so that it can be completed during the remote ack. Be more explicit on the comment. The rdma_frag can only be freed once when the peer forced a protocol change (from RDMA GET to send/recv). Otherwise the fragment will be returned once all data pertaining to it has been transferred. NOTE: Had to add a typedef for "opal_atomic_size_t" from master into opal/threads/thread_usage.h into this cherry pick (it is in opal/include/opal_stdatomic.h on master, but that file does not exist here on the v4.0.x branch). Signed-off-by: George Bosilca <bosilca@icl.utk.edu> (cherry picked from commit `a16cf0e4dd`) Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2019-05-03 06:20:02 -07:00
Brelle Emmanuel	c44821aef5	pml/ob1: fixed local handle sent during PUT control message In case of using a btl_put in ob1, the handle of the locally registered memory is sent with a PUT control message. In the current master code the sent handle is necessarily the handle in the frag but if the handle has been successfully registered in the request, the frag structure does not have any valid handle and all fragments use the request one. I suggest to check if the handle in the fragment is valid and if not to send the handle from the request. Signed-off-by: Brelle Emmanuel <emmanuel.brelle@atos.net> (cherry picked from commit `e630046a4b`)	2019-05-03 05:53:35 -07:00
Mikhail Brinskii	e4ee56d1f3	SPML/UCX: Add shmemx_alltoall_global_nb routine to shmemx.h The new routine transfers the data asynchronously from the source PE to all PEs in the OpenSHMEM job. The routine returns immediately. The source and target buffers are reusable only after the completion of the routine. After the data is transferred to the target buffers, the counter object is updated atomically. The counter object can be read either using atomic operations such as shmem_atomic_fetch or can use point-to-point synchronization routines such as shmem_wait_until and shmem_test. Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com> (cherry picked from commit `2ef5bd8b36`)	2019-05-02 21:25:59 +03:00
Howard Pritchard	3cafd02c7f	Merge pull request #6572 from markalle/v40x_fortran_macro in-place conversion macro writes into INPUT argument	2019-05-01 11:54:12 -06:00
Howard Pritchard	41ef5c7a10	Merge pull request #6594 from vspetrov/osc_ucx_rget_rkey_fix OSC/UCX: use correct rkey for atomic_fadd in rget/rput	2019-05-01 11:53:17 -06:00
Gilles Gouaillardet	e2638dbbf2	mpi: mark MPI_COMBINER_{HVECTOR,HINDEXED,STRUCT}_INTEGER removed unless configure'd with --enable-mpi1-compatibility This is a one-off commit for the v4.0.x branch since these symbols were simply removed from master. Thanks Lisandro Dalcin for reporting this. Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2019-05-01 10:50:57 +09:00
Mark Allen	c081757462	fixing an unsafe usage of integer disps[] (romio321 gpfs) There are a couple MPI_Alltoallv calls in ad_gpfs_aggrs.c where the send/recv data comes from places like req[r].lens, and the send buffer and send displacements for example were being calculated as sbuf = pick one of the reqs: req[bottom].lens sdisps[r] = req[r].lens - req[bottom].lens which might be okay if the .lens was data inside of req[] so they'd all be close to each other. But each .lens field is just a pointer that's malloced, so those addresses can be all over the place, so the integer-sized sdisps[] isn't safe. I changed it to have a new extra array sbuf and rbuf for those two Alltoallv calls, and copied the data into the sbuf from the same locations it used to be setting up the sdisps[] at, and after the Alltoallv I copy the data out of the new rbuf into the same locations it used to be setting up the rdisps[] at. For what it's worth I was able to get this to fail -np 2 on a GPFS filesystem with hints romio_cb_write enable. I didn't whittle the test down to something small, but it was failing in an MPI_File_write_all call. Signed-off-by: Mark Allen <markalle@us.ibm.com> (cherry picked from commit `d85cac8f1a`)	2019-04-25 14:22:19 -04:00
Brelle Emmanuel	2a4bc0cb58	pml/ob1: fixed exit from get_frag_fail when falling back on btl_put In the case the btl_get fails Ob1 tries to fallback on btl_put first but the return code was ignored. So the code fell back on both btl_put and btl_send. Signed-off-by: Brelle Emmanuel <emmanuel.brelle@atos.net> (cherry picked from commit `9c689f2225`)	2019-04-22 14:25:34 -07:00
Valentin Petrov	2947ab2dbc	OSC/UCX: correctly handle NULL origin addr and MPI_NO_OP Signed-off-by: Valentin Petrov <valentinp@mellanox.com>	2019-04-17 10:35:34 +03:00
Valentin Petrov	68c88e86f2	OSC/UCX: use correct rkey for atomic_fadd in rget/rput Signed-off-by: Valentin Petrov <valentinp@mellanox.com>	2019-04-16 15:24:57 +03:00
Thananon Patinyasakdikul	5999fdad5a	pml/ob1: fix deadlock with communicator flag ALLOW_OVERTAKE. We missed an assert to check if ALLOW_OVERTAKE is set or not before validating the sequence number and this will cause deadlock. Signed-off-by: Thananon Patinyasakdikul <tpatinya@utk.edu> (cherry picked from commit `0263456cf4`)	2019-04-09 11:24:24 -07:00
Mark Allen	36583df689	in-place conversion macro writes into INPUT argument In fint_2_int.h there are some conversion macros for logicals. It has one path for OMPI_SIZEOF_FORTRAN_LOGICAL != SIZEOF_INT where a new array would be allocated and the conversions then might expand to c_array[i] = (array[i] == 0 ? 0 : 1) and another path for OMPI_SIZEOF_FORTRAN_LOGICAL == SIZEOF_INT where it does things "in place", so the same conversion there would just be array[i] = (array[i] == 0 ? 0 : 1) The problem is some of the logical arrays being converted are INPUT arguments. And it's possible for some compilers to even put the argument in read-only memory so the above "in place" conversion SEGV's. A testcase I have used call MPI_CART_SUB(oldcomm, (/.true.,.false./), newcomm, ierr) and gfortran put the second arg in read-only mem. In cart_sub_f.c you can trace the ompi_fortran_logical_t *remain_dims arg. remain_dims[] is for input only, but the file uses OMPI_LOGICAL_ARRAY_NAME_DECL(remain_dims); OMPI_ARRAY_LOGICAL_2_INT(remain_dims, ndims); PMPI_Cart_sub(..., OMPI_LOGICAL_ARRAY_NAME_CONVERT(remain_dims), ...); OMPI_ARRAY_INT_2_LOGICAL(remain_dims, ndims); to convert it to c-ints make a C call then restore it to Fortran logicals before returning. It's not always wrong to convert purely in-place, eg cart_get_f.c has a periods[] that's exclusively for OUTPUT and it would be fine with the macros as they were. But I still say the macros are invalid because they don't distinguish whether they're being used on INPUT or OUTPUT args and thus they can't be used in a way that's legal for both cases. It might be possible to fix the macros by adding more of them so that cart_create_f.c and cart_get_f.c would use different macros that give more context. But my fix here is just to turn off the first block and make all paths run as if OMPI_SIZEOF_FORTRAN_LOGICAL != SIZEOF_INT. 
The main macros that get enlarged by this change are define OMPI_ARRAY_LOGICAL_2_INT_ALLOC : mallocs now define OMPI_ARRAY_LOGICAL_2_INT : also mallocs now But these are only used in 4 places, three of which are the purpose of this checkin, to avoid the former in-place expansion of an INPUT arg: cart_create_f.c cart_map_f.c cart_sub_f.c and one of which is an OUTPUT arg that was fine and that gets unnecessarily expanded into a separate array by this checkin. cart_get_f.c So I think an unnecessary malloc in cart_get_f.c is the only downside to this change, where the logicals array argument could have been used and converted in place. Signed-off-by: Mark Allen <markalle@us.ibm.com> Update provided by Gilles Gouaillardet to keep the in-place option if OMPI_FORTRAN_VALUE_TRUE == 1 where no conversion is needed. Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> (cherry picked from commit `0a7f1e3cc5`)	2019-04-05 13:34:09 -04:00
Howard Pritchard	702199f39e	Merge pull request #6545 from bertwesarg/v4.0.x-fix-cpp-condition Fix use of bitwise operation in CPP condition (v4.0.x)	2019-04-05 07:58:09 -06:00
James Clark	d8dc69feb5	Add a compilation flag that adds unwind info to all files that are present in the stack starting from MPI_Init. This is so when a debugger attaches using MPIR, it can step out of this stack back into main. This cannot be done with certain aggressive optimisations and missing debug information. Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Jeff Squyres <jsquyres@cisco.com> Co-authored-by: Jeff Squyres <jsquyres@cisco.com> (cherry-picked from `20f5840`)	2019-04-01 11:10:04 +01:00
Bert Wesarg	7f65e5b720	Fix use of bitwise operation in CPP condition Signed-off-by: Bert Wesarg <bert.wesarg@tu-dresden.de> (cherry picked from commit `18525ce39b`)	2019-03-29 10:17:09 +01:00
Sergey Oblomov	14c271f993	PML/SPML/UCX: added evaluation of mmap events - a set of UCX-related issues was reported that were caused by mmap API hook conflicts. We added diagnostics for such problems to simplify the bug-resolving pipeline Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com> (cherry picked from commit `d8e3562bae`)	2019-03-14 16:48:25 +02:00
Austen Lauria	8138cdbb49	Fix integer overflows with indexed datatype creation. The types of count, disp, and extent passed into ompi_datatype_add() should be size_t, ptrdiff_t and ptrdiff_t, respectively. This prevents integer overflows and errors in computing the size of large indexed datatypes. Signed-off-by: Austen Lauria <awlauria@us.ibm.com> (cherry picked from commit `b61e6242d3`) Signed-off-by: Austen Lauria <awlauria@us.ibm.com>	2019-03-13 14:20:26 -04:00
Howard Pritchard	5f7454a224	ompi_info: report whether MPI1 compat is enabled Its so easy to misspell compatability (sic) that we need to have ompi_info help us out. Related to #6470 Signed-off-by: Howard Pritchard <howardp@lanl.gov> (cherry picked from commit a5ba48c21839e0aab4c96afa97466a10f8bdc721)	2019-03-11 13:13:29 -06:00
Bert Wesarg	73134ab9e7	v4.0.x: Allow user to overwrite `OMPI_ENABLE_MPI1_COMPAT` Follow-up to #6120. As mentioned in [1], it may be desirable to nevertheless get the hidden MPI 1 prototypes, for users who know what they are doing, i.e., the tools guys. @ggouaillardet mentioned in [2], that `-DOMPI_OMIT_MPI1_COMPAT_DECLS=0` should work, but it does not, as then we only get redefinition warnings. See [3]. This topic does not relate to master, as we can remove the actual symbols there, but here in v4.0.x land, the symbols are always there. [1] https://github.com/open-mpi/ompi/pull/6120#issuecomment-443104700 [2] https://github.com/open-mpi/ompi/pull/6120#issuecomment-443117892 [3] https://github.com/open-mpi/ompi/pull/6120#issuecomment-468962596 Signed-off-by: Bert Wesarg <bert.wesarg@tu-dresden.de>	2019-03-07 09:54:20 +01:00
Geoffrey Paulsen	6df6a3f4bc	mpi.h.in: Revamp MPI-1 removed function warnings Refs https://github.com/open-mpi/ompi/issues/6278. This commit is intended to be cherry-picked to v4.0.x and the following commit will amend this functionality for master's removal. Changes the prototypes for MPI removed functions in the following ways: There are 4 cases: 1) User wants MPI-1 compatibility (--enable-mpi1-compatibility) MPI_Address (and friends) are declared in mpi.h with deprecation notice 2) User does not want MPI-1 compatibility, and has a C11-capable compiler Declare an MPI_Address (etc.) macro in mpi.h, which will cause a compile-time error using _Static_assert C11 feature 3) User does not want MPI-1 compatibility, and does not have a C11-capable compiler, but the compiler supports error function attributes. Declare an MPI_Address (etc.) macro in mpi.h, which will cause a compile-time error using error function attribute. 4) User does not want MPI-1 compatibility, and does not have a C11-capable compiler, or a compiler that supports error function attributes. Do not declare MPI_Address (etc.) in mpi.h at all. Unless the user is compiling with something like -Werror, this will allow the user's code to compile. We are choosing this because it seems like a losing battle to make some kind of compile time error that is friendly to the user (and doesn't make it look like mpi.h itself is broken). On v4.0.x, this will allow the user code to both compile (albeit with a warning) and link (because the MPI_Address will be in the MPI library because we are preserving ABI back to 3.0.x). On master/v5.0.x, this will allow the user code to compile, but it will fail to link (because the MPI_Address symbol will not be in the MPI library). Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com> (cherry-picked from `3136a1706c`)	2019-02-27 08:25:23 -08:00
Howard Pritchard	056d7ad0a3	Merge pull request #6419 from hppritcha/topic/fix_pgi_usempif08_4.0.x fortran:use mpif08 fix for PGI linking	2019-02-25 15:54:15 -07:00
Geoff Paulsen	1920769946	Merge pull request #6423 from abouteiller/pr6417to4.0.x v4.x: Cart/Graph create would not run the next_cid algorithm	2019-02-22 16:25:38 -06:00
Aurelien Bouteiller	d6e8d51d5f	Cart/Graph create would not run the next_cid algorithm and create disjoint communicator with inconsistent cid. Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>	2019-02-22 15:11:56 -05:00
Howard Pritchard	6596277ee8	fortran:use mpif08 fix for PGI linking commit `c6070fd2e` broke building fortran bindings with PGI compilers. Turns out PGI compilers need to link in the *.o from a module file whether or not there are module subroutines defined or not in the module file. Related to #6411 Signed-off-by: Howard Pritchard <howardp@lanl.gov> (cherry picked from commit `266bc3aced`)	2019-02-22 11:47:40 -07:00
Howard Pritchard	7aeb65579b	Merge pull request #6395 from brminich/topic/ucx_net_waddr_4.0.x PML/UCX: Use net worker address for remote peers - v4.0.x	2019-02-21 20:29:47 -07:00
Mikhail Brinskii	1c514948f6	PML/UCX: Use net worker address for remote peers For remote node peers pack smaller worker address, which contains network device addresses only. This would reduce amount of OOB traffic during startup. Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com> (cherry picked from commit `751d88192d`)	2019-02-21 16:58:20 +02:00
Howard Pritchard	83cb9ca51e	Merge pull request #6404 from ggouaillardet/topic/v4.0.x/osc_rdma_self osc/rdma: correctly handle communications to self	2019-02-20 09:53:50 -07:00
KAWASHIMA Takahiro	7b71369632	man: fix more typos in MPI_Win_attach man page Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com> [skip ci] bot:notest (cherry picked from commit open-mpi/ompi@7095ad10a5)	2019-02-20 13:26:48 +09:00
Gilles Gouaillardet	3ab227df30	man: fix typos in MPI_Win_{attach,detach} man pages no code change [skip ci] bot:notest Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> (cherry picked from commit open-mpi/ompi@7c0596819b)	2019-02-20 13:25:12 +09:00
Gilles Gouaillardet	749f51845b	osc/rdma: correctly handle communications to self mark the "self" peer OMPI_OSC_RDMA_PEER_LOCAL_BASE when the window is dynamically created and use_cpu_atomics is set in order to correctly handle communications to self. Thanks Bart Janssens for reporting this issue. Refs. open-mpi/ompi#6394 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> (back-ported from commit open-mpi/ompi@fe05fcc11a)	2019-02-20 13:06:05 +09:00
Howard Pritchard	40db950c7d	Merge pull request #6340 from jsquyres/pr/v4.0.x/make-mpi.h-a-little-friendlier-to-c++ v4.0.x: mpi.h.in: use C++ static_cast<> where appropriate	2019-02-14 17:06:47 -07:00
Howard Pritchard	d2745ad0ad	Merge pull request #6327 from ggouaillardet/topic/v4.0.x/op ompi/op: fix support of non predefined datatypes with predefined oper…	2019-02-14 17:05:32 -07:00
Howard Pritchard	0b915b7e56	Merge pull request #6333 from jsquyres/pr/v4.0.x/hwloc-macro-conflict-fixes v4.0.x: Various minor hwloc cleanups	2019-02-12 09:13:19 -07:00
Howard Pritchard	5dd63405ce	Merge pull request #6368 from jsquyres/pr/v4.0.x/fix-ofi-configury v4.0.x: fix OFI configury	2019-02-11 13:15:52 -07:00
Howard Pritchard	8552d0e608	Merge pull request #6330 from ggouaillardet/topic/v4.0.x/ompi_datatype_set_args ompi/datatype: fix how we compute the space needed for the args	2019-02-08 14:44:08 -07:00
Jeff Squyres	9ad871fc38	ofi: revamp OPAL_CHECK_OFI configury Update the OPAL_CHECK_OFI configury macro: - Make it safe to call the macro multiple times: - The checks only execute the first time it is invoked - Subsequent invocations, it just emits a friendly "checking..." message so that configure output is sensible/logical - With the goal of ultimately removing opal/mca/common/ofi, rename the output variables from OPAL_CHECK_OFI to be opal_ofi_{happy|CPPFLAGS|LDFLAGS|LIBS}. - Update btl/usnic and mtl/ofi for these new conventions. - Also, don't use AC_REQUIRE to invoke OPAL_CHECK_OFI because that causes the macro to be invoked at a fairly random time, which makes configure stdout confusing / hard to grok. - Remove a little left-over kruft in OPAL_CHECK_OFI, too (which resulted in an indenting change, making the change to opal_check_ofi.m4 look larger than it really is). Thanks Alastair McKinstry for the report and initial fix. Thanks Rashika Kheria for the reminder. Updated from master cherry pick: the OFI BTL does not exist on the v4.0.x branch. Therefore, did not include the OFI BTL changes on master in this cherry pick. Signed-off-by: Jeff Squyres <jsquyres@cisco.com> (cherry picked from commit `f5e1a672cc`)	2019-02-07 06:36:35 -08:00
Jeff Squyres	c39426ec91	mpi.h.in: use C++ static_cast<> where appropriate When compiling mpi.h with a modern C++ compiler and a high degree of pickiness (e.g., -Wold-style-cast), casting using (void*) in the OMPI_PREDEFINED_GLOBAL and MPI_STATUS_IGNORE macros will emit warnings. So if we're compiling with a C++ compiler, use C++'s static_cast<> instead of (void*). Thanks to @shadow-fax for identifying the issue. Signed-off-by: Jeff Squyres <jsquyres@cisco.com> (cherry picked from commit `30afdcead9`)	2019-01-31 04:16:07 -08:00
René Widera	e30e5b95c6	common/ompio: possible rounding issue Similar to #6286 rounding number of bytes into a single precision floating point value to round up the result of a division is a potential risk due to rounding errors. - remove floating point operations for `round up` - removes floating point conversion for round down (native behavior of integer division) Signed-off-by: René Widera <r.widera@hzdr.de> (cherry picked from commit `a91fab80a1`)	2019-01-30 12:31:39 -06:00
Edgar Gabriel	d1e8779fe3	common/ompio: fix a floating point division problem This commit fixes a problem reported on the mailing list with individual writes larger than 512 MB. The culprit is a floating point division of two large, close values. Changing the datatypes from float to double (which is what is being used in the fcoll components) fixes the problem. See issue #6285 and https://forum.hdfgroup.org/t/cannot-write-more-than-512-mb-in-1d/5118 Thanks to Axel Huebl and René Widera for reporting the issue. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu> (cherry picked from commit `c0f8ce0fff`)	2019-01-30 12:31:16 -06:00
Gilles Gouaillardet	a247292275	topo/treematch: silence a hwloc related warning treematch/km_partitioning.c #include "config.h", but there is no such file when the embedded treematch is used. In order to prevent the embedded treematch from incorrectly using the config.h from the embedded hwloc, generate a dummy config.h. Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> (cherry picked from commit `0aeb27f776`)	2019-01-30 07:33:33 -05:00
Gilles Gouaillardet	fd157a960a	ompi/datatype: fix how we compute the space needed for the args Refs. open-mpi/ompi#6275 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> (cherry picked from commit open-mpi/ompi@45fb69b2b9)	2019-01-30 11:01:11 +09:00
Gilles Gouaillardet	f76c81a758	ompi/op: fix support of non predefined datatypes with predefined operators ACCUMULATE, unlike REDUCE, can be used with derived datatypes with predefined operations, with some restrictions outlined in MPI-3:11.3.4. The derived datatype must be composed entirely from one predefined datatype (so you can do all the construction you want, but at the bottom, you can only use one datatype, say, MPI_INT). Refs. open-mpi/ompi#6275 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> (back-ported from commit open-mpi/ompi@bc1cab5498)	2019-01-30 10:29:39 +09:00
Howard Pritchard	c9764f661b	Merge pull request #6263 from jsquyres/pr/v4.0.x/minor-fortran-valgrind-fix v4.0.x: mpi/fortran: Fix valgrind warnings for type create	2019-01-13 12:31:46 -07:00
Howard Pritchard	bc58e22b03	Merge pull request #6120 from gpaulsen/topic/v4.0.x/re-add-deprecated-oops v4.0.x: Re-add removed deprecate-only MPI-2.0 symbols	2019-01-09 20:10:02 -07:00
Risto Toijala	979b401936	mpi/fortran: Fix valgrind warnings for type create Valgrind warns that newtype is uninitialized when calling from Fortran as e.g. use mpi integer :: t, err call MPI_Type_create_f90_integer(5, t, err) Since newtype is intent(out), this should not happen. There is no reason to convert the type using PMPI_Type_f2c, only to over- write it immediately afterwards. The other type_create_ functions did not convert newtype. The valgrind warnings: ==28441== Conditional jump or move depends on uninitialised value(s) ==28441== at 0x581B555: PMPI_Type_f2c (in [...]/lib/libmpi.so.0.0.0) ==28441== by 0x4E87AB7: MPI_TYPE_CREATE_F90_INTEGER (in [...]/lib/libmpi_mpifh.so.0.0.0) ==28441== by 0x400BA1: MAIN__ (in [...]) ==28441== by 0x400C46: main (in [...]) ==28441== ==28441== Conditional jump or move depends on uninitialised value(s) ==28441== at 0x581B563: PMPI_Type_f2c (in [...]/lib/libmpi.so.0.0.0) ==28441== by 0x4E87AB7: MPI_TYPE_CREATE_F90_INTEGER (in [...]/lib/libmpi_mpifh.so.0.0.0) ==28441== by 0x400BA1: MAIN__ (in [..]) ==28441== by 0x400C46: main (in [...]) ==28441== ==28441== Use of uninitialised value of size 8 ==28441== at 0x581B577: PMPI_Type_f2c (in [...]/lib/libmpi.so.0.0.0) ==28441== by 0x4E87AB7: MPI_TYPE_CREATE_F90_INTEGER (in [...]/lib/libmpi_mpifh.so.0.0.0) ==28441== by 0x400BA1: MAIN__ (in [...]) ==28441== by 0x400C46: main (in [...]) ==28441== Signed-off-by: Risto Toijala <risto.toijala@gmail.com> (cherry picked from commit `f14a0f4fc9`)	2019-01-09 07:24:22 -08:00
Jeff Squyres	1a1a932acc	romio321: ensure to distribute ompi_grequestx.h Refs https://github.com/open-mpi/ompi/issues/6227. Thanks to George Marselis for reporting. Signed-off-by: Jeff Squyres <jsquyres@cisco.com> (cherry picked from commit `62321be186`)	2018-12-28 13:18:10 -08:00
Geoffrey Paulsen	4aa91e1ffb	Return MPI1 function implementations to build list Adding the implementations of the functions that were removed from the MPI standard to the build list, regardless of the state of the OMPI_ENABLE_MPI1_COMPAT. According to the README, we want the OMPI_ENABLE_MPI1_COMPAT configure flag to control which MPI prototypes are exposed in mpi.h, NOT, which are built into the mpi library. Those will remain in the mpi library until a future major release (5.0?) NOTE: for the Fortran implementations, we instead define OMPI_OMIT_MPI1_COMPAT_DECLS to 0 instead of OMPI_ENABLE_MPI1_COMPAT to 1. I'm not sure why, but this seems to work correctly. Also changing the removed MPI_Errhandler_create implementation to use the non removed MPI_Comm_errhandler_function prototype (prototype remains unchanged from MPI_Comm_errhandler_fn) NOTE: This commit is NOT a cherry-pick from master, because on master, we are no longer building those symbols by default, but on v4.0.x we _ARE_ still building these symbols by default. This is because the v4.0.x branch is to remain backwards compatible with v3.0.x, while at the same time removing the "removed" symbols from mpi.h (unless the user configures with --enable-mpi1-compatibility) Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>	2018-12-20 12:22:04 -06:00
Howard Pritchard	4be4282312	Merge pull request #6128 from ggouaillardet/topic/v4.0.x/mpiext_short_path mpiext: keep paths short	2018-12-17 13:22:19 -07:00
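
The rounding risk described in commits e30e5b95c6 and d1e8779fe3 can be illustrated with a minimal sketch. This is not the ompio code: the function names are hypothetical, and single precision is emulated in Python by round-tripping values through `struct`. It assumes the pattern the commit message describes, namely rounding a byte count up to a whole number of chunks by dividing in single precision and taking the ceiling, and compares that with pure integer arithmetic.

```python
import math
import struct


def to_float32(x):
    """Round a number to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", float(x)))[0]


def ceil_div_float32(num_bytes, chunk):
    """Hypothetical float-based round-up: ceil((float)num_bytes / (float)chunk)."""
    quotient = to_float32(to_float32(num_bytes) / to_float32(chunk))
    return math.ceil(quotient)


def ceil_div_int(num_bytes, chunk):
    """Integer-only round-up; no conversion to floating point is involved."""
    return (num_bytes + chunk - 1) // chunk


# 2**24 + 1 is the smallest positive integer that single precision cannot
# represent exactly, so the float path loses the final byte.
num_bytes = 2**24 + 1
chunk = 2**24

print(ceil_div_float32(num_bytes, chunk))  # 1 (one chunk too few)
print(ceil_div_int(num_bytes, chunk))      # 2
```

Integer division already truncates toward zero for non-negative operands, which is why the commit message notes that rounding down needs no floating point conversion either.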


10361 commits