openmpi

Author	SHA1	Message	Date
George Bosilca	a0fce4eac2	Fix the man pages for some of the MPI_T_* functions. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2019-05-29 00:23:35 -04:00
George Bosilca	eed770ce5c	Fix the SPC initialization. Use the PVAR ctx to save the SPC index, so that no lookup nor restriction on the SPC vars position is imposed. Make sure the PVAR are always registered. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2019-05-29 00:23:18 -04:00
George Bosilca	7dab8c002b	Fixed SPC/MPI_T initialization error. Signed-off-by: Yong Qin <yongq@mellanox.com>	2019-05-28 15:10:32 -04:00
Tomislav Janjusic	6ea920e225	Coll/hcoll: adding scatterv interface Signed-off-by: Valentin Petrov valentinp@mellanox.com	2019-05-27 12:27:43 +03:00
Edgar Gabriel	8eda9f2ecd	common/ompio: fix Coverity warnings this commit fixes Coverity warnings CID 1445198 and CID 1445197 For a reason that is a bit unclear to me, Coverity only complained about the read files, but the write operations had the same issue, so I fixed that within the same commit as well. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2019-05-23 13:40:39 -05:00
Edgar Gabriel	27b2ec71a7	common/ompio: add support for read operations and collective I/O external32 data representation is now supported by ompio for everything but non-blocking collective I/O operations. The support can further be improved in a second step to limit the temporary buffer size (at least for blocking operations), but it does work now for many scenarios. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2019-05-20 17:56:16 -05:00
Edgar Gabriel	ab56e6f0db	common/ompio: make individual read operations work. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2019-05-20 17:22:33 -05:00
Edgar Gabriel	f6b3a0af52	common/ompio: individual write of external32 works both blocking and non-blocking. collective write and read operations not yet. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2019-05-20 16:26:14 -05:00
Edgar Gabriel	d955753cb8	common/ompio: abstraction for different convertor types introduce separate convertors for memory vs. file representation. Adjust the interfaces for decode_datatype to provide the convertor to be used for that. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2019-05-20 13:35:38 -05:00
Edgar Gabriel	35be18b266	common/ompio: rename ompio_cuda* to ompio_buffer* the infrastructure put in place to manage cuda buffers is actually a lot more generic than just for cuda buffers. Specifically, we can reuse much of the code to implement the external32 data representation. This commit converts the code from common_ompio_cuda* to common_ompio_buffer*. There are just very few places where we actually need to keep the OPAL_CUDA_SUPPORT ifdef in place. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2019-05-20 12:50:04 -05:00
Edgar Gabriel	a96efb7620	common/ompio: add comm_ompio_read_all/write_all functions in preparation for adding support for the external32 data representation. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2019-05-20 12:49:36 -05:00
Valentin Petrov	f19f6f432a	Coll/hcoll: don't init opal memhooks unless explicitly requested by user If user sets HCOLL_EXTERNAL_UCM_EVENTS=1 then we try init opal memory framework and register a mem release cb. Otherwise, rely on ucx. Signed-off-by: Valentin Petrov <valentinp@mellanox.com>	2019-05-20 11:17:44 +03:00
Yossi Itigin	9d1994b906	OSC/UCX: Fix deadlock with atomic lock Atomic lock must progress local worker while obtaining the remote lock, otherwise an active message which actually releases the lock might not be processed while polling on local memory location. Signed-off-by: Yossi Itigin <yosefe@mellanox.com>	2019-05-19 20:10:09 +03:00
Sergey Oblomov	a3578d9ece	PML/UCX: disable PML UCX if MT is requested but not supported - in case if multithreading requested but not supported disable PML UCX Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2019-05-17 11:25:23 +03:00
bosilca	6089608858	Merge pull request #6647 from bosilca/fix/length_0 Fix/length 0	2019-05-14 17:59:15 -04:00
Jeff Squyres	9442989e2c	Merge pull request #6382 from jsquyres/pr/ofi-mtl-gitignore mtl/ofi: add a .gitignore	2019-05-13 12:00:41 -04:00
George Bosilca	42119254c7	Fix incorrect behavior with length == 0 Fixes #6575. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2019-05-10 19:53:34 -04:00
George Bosilca	d141bf7912	Update the datatype dump to match the actual types. Update the comments to better reflect what is going on. Minor indentations. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2019-05-10 18:03:57 -04:00
Nathan Hjelm	4345308dfd	osc/rdma: fix CAS 32-bit network atomic compatibility check When checking for btl compatibility with 32-bit CAS osc/rdma was checking the incorrect flag field. Signed-off-by: Nathan Hjelm <hjelmn@cs.unm.edu>	2019-05-10 07:27:53 -06:00
KAWASHIMA Takahiro	dabad084b5	Merge pull request #6621 from bosilca/topic/persistent_req_leak Fix the leak of fragments for persistent sends (issue #6565)	2019-05-03 15:21:42 +09:00
George Bosilca	a16cf0e4dd	Fix the leak of fragments for persistent sends. The rdma_frag attached to the send request was not correctly released upon request completion, leaking until MPI_Finalize. A quick solution would have been to add RDMA_FRAG_RETURN at different locations on the send request completion, but it would have unnecessarily made the sendreq completion path more complex. Instead, I added the length to the RDMA fragment so that it can be completed during the remote ack. Be more explicit on the comment. The rdma_frag can only be freed once when the peer forced a protocol change (from RDMA GET to send/recv). Otherwise the fragment will be returned once all data pertaining to it has been transferred. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2019-05-02 09:40:11 -04:00
Jeff Squyres	ac54d771ec	mtl/ofi: add a .gitignore Ignore generated files. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2019-05-01 14:00:00 -07:00
Yossi Itigin	5d2200a7d6	Merge pull request #6605 from brminich/topic/shmem_all2all_put SPML/UCX: Add shmemx_alltoall_global_nb routine to shmemx.h	2019-05-01 12:00:21 +03:00
bosilca	399b7133ab	Merge pull request #6556 from EmmanuelBRELLE/PR_fix_local_handle_in_PUT_message pml/ob1: fixed local handle sent during PUT control message	2019-04-27 13:51:22 -04:00
Mikhail Brinskii	2ef5bd8b36	SPML/UCX: Add shmemx_alltoall_global_nb routine to shmemx.h The new routine transfers the data asynchronously from the source PE to all PEs in the OpenSHMEM job. The routine returns immediately. The source and target buffers are reusable only after the completion of the routine. After the data is transferred to the target buffers, the counter object is updated atomically. The counter object can be read either using atomic operations such as shmem_atomic_fetch or can use point-to-point synchronization routines such as shmem_wait_until and shmem_test. Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>	2019-04-26 14:47:58 +03:00
Mark Allen	d85cac8f1a	fixing an unsafe usage of integer disps[] (romio321 gpfs) There are a couple MPI_Alltoallv calls in ad_gpfs_aggrs.c where the send/recv data comes from places like req[r].lens, and the send buffer and send displacements for example were being calculated as sbuf = pick one of the reqs: req[bottom].lens sdisps[r] = req[r].lens - req[bottom].lens which might be okay if the .lens was data inside of req[] so they'd all be close to each other. But each .lens field is just a pointer that's malloced, so those addresses can be all over the place, so the integer-sized sdisps[] isn't safe. I changed it to have a new extra array sbuf and rbuf for those two Alltoallv calls, and copied the data into the sbuf from the same locations it used to be setting up the sdisps[] at, and after the Alltoallv I copy the data out of the new rbuf into the same locations it used to be setting up the rdisps[] at. For what it's worth I was able to get this to fail -np 2 on a GPFS filesystem with hints romio_cb_write enable. I didn't whittle the test down to something small, but it was failing in an MPI_File_write_all call. Signed-off-by: Mark Allen <markalle@us.ibm.com>	2019-04-23 16:01:55 -04:00
Jeff Squyres	9a9d106296	Merge pull request #6555 from EmmanuelBRELLE/PR-pmlob1_fix_rc_for_putfrag_when_get_failed pml/ob1: fixed exit from get_frag_fail when falling back on btl_put	2019-04-22 17:19:12 -04:00
Gilles Gouaillardet	251477c518	Merge pull request #6431 from ggouaillardet/topic/mpiext_nolib mpiext/shortfloat: do not create empty libraries	2019-04-22 11:23:19 +09:00
Edgar Gabriel	c80a842036	Merge pull request #6602 from edgargabriel/topic/io_array_refactor common/ompio: refactor the build_io_array function	2019-04-18 13:44:48 -05:00
Gilles Gouaillardet	e1098dae4b	mpiext/shortfloat: do not build an empty library the shortfloat extension is only made of header files, and hence do not require a library to be built. Refs. open-mpi/ompi#6205 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>	2019-04-18 13:42:18 -04:00
Gilles Gouaillardet	e70780b762	configury: allow mpi extensions with no libraries Do not require an archive when the OMPI_MPIEXT_<ext>_HAVE_OBJECT macro is defined to 0. See `ompi/mpiext/example/configure.m4`. Allow some extensions to be built on OS X since the creation of archives with no files is not permitted. Refs. open-mpi/ompi#6205 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com> Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2019-04-18 13:42:01 -04:00
Gilles Gouaillardet	232055fc7a	fortran/use-mpi-f08: fix intent of the internal ompi_*_f bindings Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2019-04-18 13:29:19 +09:00
Edgar Gabriel	d43427fc76	common/ompio: refactor the build_io_array function abstract out the io_array structure to be used in common_ompio_build_io_array function. This is preparation for a future component that would like to use the same function, but not modify the io_array stored on the file handle itself. Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2019-04-17 14:42:33 -05:00
Valentin Petrov	30970bdfdf	OSC/UCX: correctly handle NULL origin addr and MPI_NO_OP Additional bugfix: origin_addr -> result_addr for no_op, replace_op and sum_op fetch destination. Signed-off-by: Valentin Petrov <valentinp@mellanox.com>	2019-04-17 10:30:21 +03:00
bosilca	8cf7a7e87d	Merge pull request #6538 from bosilca/topic/issue6522 Prevent a segfault when accessing a rank outside a communicator.	2019-04-09 18:08:49 -04:00
David Eberius	461d8bc77b	Fixed a potential name collision. Signed-off-by: David Eberius <deberius@vols.utk.edu>	2019-04-03 16:43:48 -04:00
markalle	98fdeeeb41	Merge pull request #6448 from markalle/macro_writing_input_arg in-place conversion macro writes into INPUT argument	2019-04-02 11:33:18 -05:00
Brelle Emmanuel	e630046a4b	pml/ob1: fixed local handle sent during PUT control message In case of using a btl_put in ob1, the handle of the locally registered memory is sent with a PUT control message. In the current master code the sent handle is necessarily the handle in the frag but if the handle has been successfully registered in the request, the frag structure does not have any valid handle and all fragments use the request one. I suggest to check if the handle in the fragment is valid and if not to send the handle from the request. Signed-off-by: Brelle Emmanuel <emmanuel.brelle@atos.net>	2019-04-01 18:45:05 +02:00
Brelle Emmanuel	9c689f2225	pml/ob1: fixed exit from get_frag_fail when falling back on btl_put In the case the btl_get fails Ob1 tries to fallback on btl_put first but the return code was ignored. So the code fell back on both btl_put and btl_send. Signed-off-by: Brelle Emmanuel <emmanuel.brelle@atos.net>	2019-04-01 18:17:10 +02:00
Mark Allen	0a7f1e3cc5	in-place conversion macro writes into INPUT argument In fint_2_int.h there are some conversion macros for logicals. It has one path for OMPI_SIZEOF_FORTRAN_LOGICAL != SIZEOF_INT where a new array would be allocated and the conversions then might expand to c_array[i] = (array[i] == 0 ? 0 : 1) and another path for OMPI_SIZEOF_FORTRAN_LOGICAL == SIZEOF_INT where it does things "in place", so the same conversion there would just be array[i] = (array[i] == 0 ? 0 : 1) The problem is some of the logical arrays being converted are INPUT arguments. And it's possible for some compilers to even put the argument in read-only memory so the above "in place" conversion SEGV's. A testcase I have used call MPI_CART_SUB(oldcomm, (/.true.,.false./), newcomm, ierr) and gfortran put the second arg in read-only mem. In cart_sub_f.c you can trace the ompi_fortran_logical_t *remain_dims arg. remain_dims[] is for input only, but the file uses OMPI_LOGICAL_ARRAY_NAME_DECL(remain_dims); OMPI_ARRAY_LOGICAL_2_INT(remain_dims, ndims); PMPI_Cart_sub(..., OMPI_LOGICAL_ARRAY_NAME_CONVERT(remain_dims), ...); OMPI_ARRAY_INT_2_LOGICAL(remain_dims, ndims); to convert it to c-ints make a C call then restore it to Fortran logicals before returning. It's not always wrong to convert purely in-place, eg cart_get_f.c has a periods[] that's exclusively for OUTPUT and it would be fine with the macros as they were. But I still say the macros are invalid because they don't distinguish whether they're being used on INPUT or OUTPUT args and thus they can't be used in a way that's legal for both cases. It might be possible to fix the macros by adding more of them so that cart_create_f.c and cart_get_f.c would use different macros that give more context. But my fix here is just to turn off the first block and make all paths run as if OMPI_SIZEOF_FORTRAN_LOGICAL != SIZEOF_INT. The main macros that get enlarged by this change are define OMPI_ARRAY_LOGICAL_2_INT_ALLOC : mallocs now define OMPI_ARRAY_LOGICAL_2_INT : also mallocs now But these are only used in 4 places, three of which are the purpose of this checkin, to avoid the former in-place expansion of an INPUT arg: cart_create_f.c cart_map_f.c cart_sub_f.c and one of which is an OUTPUT arg that was fine and that gets unnecessarily expanded into a separate array by this checkin. cart_get_f.c So I think an unnecessary malloc in cart_get_f.c is the only downside to this change, where the logicals array argument could have been used and converted in place. Signed-off-by: Mark Allen <markalle@us.ibm.com> Update provided by Gilles Gouaillardet to keep the in-place option if OMPI_FORTRAN_VALUE_TRUE == 1 where no conversion is needed. Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2019-04-01 10:38:05 -04:00
KAWASHIMA Takahiro	63a1968459	man: Fix typo of MPI_TYPE_GET_NAME Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>	2019-03-29 13:01:52 +09:00
Jeff Squyres	05c5e2034b	Merge pull request #6527 from James-A-Clark/master Add compilation flag to allow unwinding through files that are present in the stack when attaching with MPIR	2019-03-28 18:16:02 -04:00
George Bosilca	6ea0c4eab9	Prevent a segfault when accessing a rank outside a communicator. This is not fixing any issue, it is simply preventing a segfault if the communicator creation has not happened as expected. Thus, this code path should never really be hit in a correct MPI application with a valid communicator creation support. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2019-03-28 12:03:29 -04:00
Jeff Squyres	3c1b33c93a	Merge pull request #6140 from bertwesarg/fix-cpp-condition Fix use of bitwise operation in CPP condition	2019-03-28 10:06:20 -04:00
James Clark	20f5840cbb	Add a compilation flag that adds unwind info to all files that are present in the stack starting from MPI_Init. This is so when a debugger attaches using MPIR, it can step out of this stack back into main. This cannot be done with certain aggressive optimisations and missing debug information. Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Jeff Squyres <jsquyres@cisco.com> Co-authored-by: Jeff Squyres <jsquyres@cisco.com>	2019-03-27 14:32:15 +00:00
Ralph Castain	dfbc14430d	Merge pull request #6440 from ggouaillardet/topic/yield_when_idle schizo/ompi: correctly handle the yield_when_idle option	2019-03-25 12:17:34 -07:00
Artem Polyakov	bfff5783f9	Merge pull request #6371 from artpol84/osc/select_dbg osc/base: Add debug output stating a selected component	2019-03-22 22:24:04 -07:00
Yossi Itigin	9b91cf09cc	Merge pull request #6481 from hoopoepg/topic/check-ucx-params PML/SPML/UCX: added evaluation of mmap events	2019-03-14 11:53:42 +02:00
Austen Lauria	b61e6242d3	Fix integer overflows with indexed datatype creation. The types of count, disp, and extent passed into ompi_datatype_add() should be size_t, ptrdiff_t and ptrdiff_t, respectively. This prevents integer overflows and errors in computing the size of large indexed datatypes. Signed-off-by: Austen Lauria <awlauria@us.ibm.com>	2019-03-13 09:39:57 -04:00
Sergey Oblomov	d8e3562bae	PML/SPML/UCX: added evaluation of mmap events - there was a set of UCX related issues reported which caused by mmap API hooks conflicts. We added diagnostic of such problems to simplify bug-resolving pipeline Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2019-03-12 21:14:27 +02:00

10444 Commits