It was previously accidentally set to 0.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 132e4cab3bc71df0da87368a332d6af0090a6977)
Move the prefix area from the head to the body in relevant size
computations. This fixes a problem in high traffic situations where
usNIC may have sent from unregistered memory.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit fe7f772f21627b01838c007db7cedbbb0ce8b536)
New MCA param: btl_usnic_max_resends_per_iteration. This is the max
number of resends we'll do in a single pass through usNIC component
progress. This prevents progress from getting stuck in an endless
loop of retransmissions (i.e., if more retransmissions are triggered
during the sending of retransmissions). Specifically: we need to
leave the resend loop to allow receives to happen (which may ACK
messages we have sent previously, and therefore cause pending resends
to be moot).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 27e3040dfeba00a9a2615a217c164899f0009e59)
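A minimal sketch of the bounded resend loop this describes (helper names are hypothetical; the real usNIC progress logic differs in detail):

    struct endpoint;                                 /* opaque peer state */
    extern int btl_usnic_max_resends_per_iteration;  /* the new MCA param */
    extern int  resend_pending(struct endpoint *ep); /* hypothetical helpers */
    extern void resend_one(struct endpoint *ep);

    static void progress_resends(struct endpoint *ep)
    {
        int sent = 0;
        /* Cap resends per progress pass so we return to polling receives,
           whose ACKs may make pending resends moot. */
        while (resend_pending(ep) && sent < btl_usnic_max_resends_per_iteration) {
            resend_one(ep);
            ++sent;
        }
    }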
Significantly increase the default retrans timeout. If the retrans
timeout is too short, we can end up in a retransmission storm
where the logic will continually re-transmit the same frames during a
single run through the usNIC progress function (because the timer for
a single frame expires before we have run through re-transmitting all
the frames pending re-transmission).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 3cc95d86b2123f38f392e56adca7ac8a1fef6454)
New MCA parameter: btl_usnic_ack_iteration_delay. Set this to the
number of times through the usNIC component progress function before
sending a standalone ACK (vs. piggy-backing the ACK on any other send
going to the target peer).
Use "ticks" language to clarify that we're really counting the number
of times through the usNIC component DATA_CHANNEL completion check (to
check for incoming messages) -- it has no relation to wall clock time
whatsoever.
Also slightly change the channel-checking scheme in usNIC component
progress: only check the PRIORITY channel once (vs. checking it once,
not finding anything, and then falling through to progress_2(), where
we check PRIORITY again and then check the DATA channel).
As before, if our "progress" libevent fires, increment the tick
counter enough to guarantee that all endpoints that need an ACK will
get triggered to send standalone ACKs the next time through progress,
if necessary.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 968b1a51b59898877a8c7268d463d3d7d78d86a3)
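A hedged sketch of the tick-based standalone-ACK scheme (all names hypothetical); "ticks" count passes through the DATA_CHANNEL completion check, not wall-clock time:

    #include <stdint.h>

    extern uint64_t ticks;                          /* ++ per DATA_CHANNEL check */
    extern uint64_t btl_usnic_ack_iteration_delay;  /* the new MCA param */

    struct endpoint {
        int      ack_needed;    /* a standalone ACK is owed to this peer */
        uint64_t ack_due_tick;  /* tick at which it became owed */
    };

    static void maybe_send_standalone_ack(struct endpoint *ep)
    {
        if (ep->ack_needed &&
            ticks - ep->ack_due_tick >= btl_usnic_ack_iteration_delay) {
            /* send_standalone_ack(ep); */
            ep->ack_needed = 0;
        }
    }

    /* When the "progress" libevent fires, advance the tick counter enough
       that every endpoint owing an ACK triggers on the next pass. */
    static void progress_event_cb(void)
    {
        ticks += btl_usnic_ack_iteration_delay;
    }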
Rename "get_nsec()" to "get_ticks()" to more accurately reflect that
this function has no correlation to wall clock time at all.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit ce2910a28aea61043b81324c67999f3a47cfe7ac)
Running processes via mpirun in Podman containers has shown that the
CMA btl_vader_single_copy_mechanism does not work when user namespaces
are involved.
Creating containers with Podman requires at least user namespaces in
order to do unprivileged mounts in a container.
Even when running the container with user namespace user ID mappings
that result in the same user ID inside and outside of all involved
containers, the kernel check that allows ptrace (and thus
process_vm_{read,write}v()) fails if the same IDs are not in the same
user namespace.
One workaround is to specify '--mca btl_vader_single_copy_mechanism none';
this commit adds code to automatically skip CMA and fall back to
MCA_BTL_VADER_EMUL if user namespaces are detected.
Signed-off-by: Adrian Reber <areber@redhat.com>
(cherry picked from commit fc68d8a90fe86284e9dc730f878b55c0412f01d2)
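A minimal sketch of one way to detect a non-initial user namespace (an assumption for illustration, not necessarily the exact check this commit adds): in the initial namespace, /proc/self/uid_map holds the identity mapping "0 0 4294967295".

    #include <stdio.h>

    static int in_user_namespace(void)
    {
        unsigned long start, target, count;
        char line[64];
        FILE *fp = fopen("/proc/self/uid_map", "r");

        if (NULL == fp) {
            return 0;    /* kernel without user namespace support */
        }
        if (NULL == fgets(line, sizeof(line), fp)) {
            fclose(fp);
            return 1;
        }
        fclose(fp);
        /* the initial namespace maps all IDs 1:1 starting at uid 0 */
        if (3 == sscanf(line, "%lu %lu %lu", &start, &target, &count)) {
            return !(0 == start && 0 == target && 4294967295UL == count);
        }
        return 1;
    }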
This commit changes how the single-copy emulation in the vader btl
operates. Before this change the BTL set its put and get limits
based on the max send size. After this change the limits are unset
and the put or get operation is fragmented internally.
References #6568
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
(cherry picked from commit ae91b11de2314ab11a9842d9738cd14f8f1e393b)
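A sketch of the internal fragmentation this describes (names hypothetical): rather than capping put/get at the max send size, the BTL splits the operation into bounded chunks itself.

    #include <stddef.h>
    #include <stdint.h>

    extern size_t max_send_size;                                 /* btl max send size */
    extern int copy_chunk(void *dst, uint64_t src, size_t len);  /* one bounded copy */

    static int emulated_get(void *dst, uint64_t src, size_t len)
    {
        while (len > 0) {
            size_t chunk = len < max_send_size ? len : max_send_size;
            int rc = copy_chunk(dst, src, chunk);
            if (0 != rc) {
                return rc;    /* propagate the failed chunk's error */
            }
            dst = (char *) dst + chunk;
            src += chunk;
            len -= chunk;
        }
        return 0;
    }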
This patch fixes the merge of contiguous elements into larger but more
compact datatypes, and allows contiguous elements to have their
blocklen increased instead of the count. The idea is to always maximize
the blocklen, i.e., the contiguous part of the datatype.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
(cherry picked from commit 41e6f55807b01ad5c04e8387a3699cf743931f6a)
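An illustrative sketch, over a hypothetical vector descriptor, of the "maximize the blocklen" rule: when the stride equals the blocklen there is no gap between blocks, so grow the blocklen instead of the count.

    #include <stddef.h>

    typedef struct {
        size_t    count;     /* number of blocks */
        size_t    blocklen;  /* contiguous elements per block */
        ptrdiff_t stride;    /* elements between block starts */
    } vec_desc_t;

    static void maximize_blocklen(vec_desc_t *v)
    {
        if (v->stride == (ptrdiff_t) v->blocklen) {  /* blocks touch: no gap */
            v->blocklen *= v->count;                 /* one large contiguous block */
            v->count  = 1;
            v->stride = (ptrdiff_t) v->blocklen;
        }
    }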
Start optimizing the code.
This commit divides the operations into two parts: the first, outside
the critical path, deals with partial blocks of predefined elements, and
the second, inside the critical path, deals only with full blocks of
elements. This reduces the number of expensive operations in the
critical path and results in a decent performance increase.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
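A sketch of the split described above (hypothetical helper): peel the partially consumed first block off outside the hot loop, so the critical path only ever copies full blocks.

    #include <stddef.h>
    #include <string.h>

    /* src points at the first unconsumed byte; head is what remains of a
       partially consumed first block (0 if none); count full blocks follow. */
    static size_t pack_vector(char *dst, const char *src, size_t count,
                              size_t blocklen, ptrdiff_t extent, size_t head)
    {
        size_t copied = 0;

        if (head > 0) {                   /* partial block: outside the hot loop */
            memcpy(dst, src, head);
            dst += head;
            copied += head;
            src += extent - (ptrdiff_t) (blocklen - head);
        }
        while (count-- > 0) {             /* critical path: full blocks only */
            memcpy(dst, src, blocklen);
            dst += blocklen;
            copied += blocklen;
            src += extent;
        }
        return copied;
    }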
Amazing how bad instruction scheduling can have such a drastic impact
on code performance. With this change, we get a boost of at least
50% in performance for data with a small blocklen and/or count.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Optimize contiguous loops by collapsing them into a single element.
During datatype optimization collapse similar elements into larger
blocks.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Upon detecting a datatype loop representation, skip the entire loop
according to the remaining space.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
- Optimize handling of contiguous-with-gaps datatypes.
- Fix a performance issue for all datatypes with a count of 1.
- Optimize the pack/unpack of contiguous-with-gaps datatypes.
- Optimize the case of blocklen == 1.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Move toward a base type of vector (count, type, blocklen, extent, disp),
with disp and extent applying to the count repetitions and blocklen
describing a contiguous memory region of the given type.
Implement two optimizations on this description, applied during type_commit:
- collapse: successive similar datatype descriptions are collapsed
together with an increased count.
- fusion: fuse successive datatype descriptions in order to minimize the
  number of resulting memcpy calls during pack/unpack.
Fixes at the OMPI datatype level including:
- Fix the create_hindexed and vector creation.
- Fix the handling of [get|set]_elements and _count.
- Correctly compute the displacement for block indexed types.
- Support the MPI_LB and MPI_UB deprecation, aka. OMPI_ENABLE_MPI1_COMPAT.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
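A hedged sketch of the two type_commit optimizations, over a hypothetical element descriptor (blocklen counted in bytes here only to keep the sketch simple):

    #include <stddef.h>

    typedef struct {
        int       type;      /* predefined type id */
        size_t    count;     /* repetitions */
        size_t    blocklen;  /* contiguous bytes per repetition */
        ptrdiff_t extent;    /* bytes between repetitions */
        ptrdiff_t disp;      /* starting displacement */
    } elem_t;

    /* collapse: a description that continues an identical predecessor is
       absorbed by increasing the predecessor's count */
    static int try_collapse(elem_t *a, const elem_t *b)
    {
        if (a->type == b->type && a->blocklen == b->blocklen &&
            a->extent == b->extent &&
            b->disp == a->disp + (ptrdiff_t) a->count * a->extent) {
            a->count += b->count;
            return 1;
        }
        return 0;
    }

    /* fusion: two contiguous descriptions become one longer block, so
       pack/unpack issues one memcpy instead of two */
    static int try_fuse(elem_t *a, const elem_t *b)
    {
        if (a->type == b->type && 1 == a->count && 1 == b->count &&
            b->disp == a->disp + (ptrdiff_t) a->blocklen) {
            a->blocklen += b->blocklen;
            return 1;
        }
        return 0;
    }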
Commit d7053a3 broke things when Open MPI 4.0.x is built without UCX
support. The problem was that it tried to partially initialize the btl
in order to delay printing a help message until wireup, which does not
work in all cases. Rather than keep piling on changes to support a help
message for a BTL that we are deprecating, take a keep-it-simple-stupid
approach.
So, revert most of d7053a3 and instead put the help message back in its
original location, during the scan of the available HCAs' ports to check
whether the link layer for each port is configured for ethernet or
infiniband. If Open MPI was built with UCX support, don't emit the help
message; if UCX was not linked in, emit it.
Verified on a system with ConnectX-5 HCAs with two ports configured for
ethernet and two for infiniband.
relates to #6785
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
Per patches from @SteVwonder and @garlick
Signed-off-by: Ralph Castain <rhc@pmix.org>
(cherry picked from commit d4070d5f58f0c65aef89eea5910b202b8402e48b)
This commit updates the uct btl to support the v1.6.x release of
UCX. This release breaks the API.
Signed-off-by: Nathan Hjelm <hjelmn@cs.unm.edu>
(cherry picked from commit b78066720c3e3299bd76f2e22d2c0e415db572fc)
Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
This is mostly based off recent UCX additions to their patcher:
https://github.com/openucx/ucx/pull/2703
They added triggers for
* mmap when (flags & MAP_FIXED) && (addr != NULL)
* shmat when (shmflg & SHM_REMAP) && (shmaddr != NULL)
Beyond that, I noticed they already had a trigger for
* madvise when (advice == MADV_FREE)
that we didn't, so I added that.
The other main thing is that we didn't really have shmat/shmdt active
on some systems, because we only had a path for syscall(SYS_shmdt, )
but we also needed a path for syscall(SYS_ipc, IPCOP_shmdt, ), and the
same for shmat.
Signed-off-by: Mark Allen <markalle@us.ibm.com>
(cherry picked from commit eb888118e83f56c131aff900b03eab34c92b7805)
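A simplified sketch of the trigger conditions listed above (the invalidation hook name is hypothetical); only calls that can replace or discard existing pages need to invalidate registrations:

    #define _GNU_SOURCE
    #include <stddef.h>
    #include <sys/mman.h>
    #include <sys/shm.h>

    extern void invalidate_region(void *addr, size_t len);  /* hypothetical */

    static void on_mmap(void *addr, size_t len, int flags)
    {
        if ((flags & MAP_FIXED) && NULL != addr) {
            invalidate_region(addr, len);    /* mapping may replace pages */
        }
    }

    static void on_shmat(const void *shmaddr, int shmflg)
    {
        if ((shmflg & SHM_REMAP) && NULL != shmaddr) {
            invalidate_region((void *) shmaddr, 0);
        }
    }

    static void on_madvise(void *addr, size_t len, int advice)
    {
    #ifdef MADV_FREE
        if (MADV_FREE == advice) {
            invalidate_region(addr, len);    /* pages may be discarded */
        }
    #endif
    }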
- initialize the memory hooks infrastructure only if external memory
  hooks are requested
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit a0a93060668cd11a783cc94c753efb3129df9dde)
Free the component mpool in mca_btl_vader_component_close(), after
freeing some objects that depend on it, such as
mca_btl_vader_component.vader_frags_user.
Thanks to Christoph Niethammer for reporting this.
Refs. open-mpi/ompi#6524
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit open-mpi/ompi@77060cad07)
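A minimal sketch of the teardown ordering (helpers hypothetical): objects allocated from the mpool must be returned before the mpool itself is destroyed.

    extern void return_user_frags(void);  /* hypothetical: vader_frags_user etc. */
    extern void destroy_mpool(void);      /* hypothetical */

    static void component_close_sketch(void)
    {
        return_user_frags();  /* frags live in mpool-backed memory, free first */
        destroy_mpool();      /* safe only once nothing references the pool */
    }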
The rdma_frag attached to the send request was not correctly released
upon request completion, leaking until MPI_Finalize. A quick solution
would have been to add RDMA_FRAG_RETURN at different locations on the
send request completion, but it would have unnecessarily made the
sendreq completion path more complex. Instead, I added the length to
the RDMA fragment so that it can be completed during the remote ack.
Be more explicit in the comment.
The rdma_frag can only be freed once, when the peer forced a protocol
change (from RDMA GET to send/recv). Otherwise the fragment will be
returned once all data pertaining to it has been transferred.
NOTE: Had to add a typedef for "opal_atomic_size_t" from master into
opal/threads/thread_usage.h into this cherry pick (it is in
opal/include/opal_stdatomic.h on master, but that file does not exist
here on the v4.0.x branch).
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
(cherry picked from commit a16cf0e4dd6df4dea820fecedd5920df632935b8)
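A sketch of the idea (hypothetical struct and fields): recording the length on the fragment lets the remote-ACK path decide, exactly once, when the fragment is fully transferred and can be returned.

    #include <stdatomic.h>
    #include <stddef.h>

    struct rdma_frag {
        size_t         length;  /* total bytes this fragment covers (the fix) */
        _Atomic size_t acked;   /* bytes acknowledged so far */
    };

    static void on_remote_ack(struct rdma_frag *frag, size_t bytes)
    {
        /* return the fragment exactly once, when all its data is ACKed */
        if (atomic_fetch_add(&frag->acked, bytes) + bytes == frag->length) {
            /* RDMA_FRAG_RETURN(frag); */
        }
    }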
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
The new routine transfers the data asynchronously from the source PE to all
PEs in the OpenSHMEM job. The routine returns immediately. The source and
target buffers are reusable only after the completion of the routine.
After the data is transferred to the target buffers, the counter object
is updated atomically. The counter object can be read using atomic
operations such as shmem_atomic_fetch, or waited on using point-to-point
synchronization routines such as shmem_wait_until and shmem_test.
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
(cherry picked from commit 2ef5bd8b3671f1e10caf00d06d66d120eac9c5be)
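A hedged usage sketch on the receiving side (the initiating routine is not named in this message, so it is omitted): after the asynchronous transfer, each PE can wait on or poll the atomically updated symmetric counter.

    #include <shmem.h>

    extern long counter;   /* symmetric counter object updated by the transfer */

    static void wait_for_transfer(long expected)
    {
        /* block until the counter shows the data has landed ... */
        shmem_long_wait_until(&counter, SHMEM_CMP_EQ, expected);
        /* ... or poll it instead:
           long v = shmem_long_atomic_fetch(&counter, shmem_my_pe()); */
    }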
* Forcing the 'hash' gds component should not be necessary anymore.
Port of PR #6498 (component names changed, so a cherry-pick would not work)
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>