Among many other things:
- Fix an imbalance bug in MPI_Allgather
- Accept more human-readable configuration files. We can now specify
the collective by name instead of by a magic number, and likewise the
component we want to use.
- Add the capability to have optional arguments in the collective
communication configuration file. Right now this capability exists
for segment lengths, but it is not yet connected to the algorithms.
- Redo the initialization of all HAN collectives.
Clean up the fallback collective support.
- In case a module is unable to deliver the expected result, it will fall back
to executing the collective operation with another collective component. This
change makes the fallback support simpler to use.
- Implement a fallback allowing a HAN module to remove itself as a
potential active collective module and instead fall back to the
next module in line.
- Completely disable the HAN modules on error. From the moment an error is
encountered they remove themselves from the communicator, and if some
other module calls them they simply behave as a pass-through.
Communicator: provide ompi_comm_split_with_info to split and provide info at the same time
Add ompi_comm_coll_preference info key to control collective component selection
COLL HAN: use info keys instead of component-level variable to communicate topology level between abstraction layers
- The info value is a comma-separated list of entries, chosen in decreasing
order of priority. This overrides the priority of the component,
unless the component has disqualified itself.
An entry prefixed with ^ starts the ignore-list. Any entry following this
character will be ignored during the collective component selection for the
communicator.
Example: "sm,libnbc,^han,adapt" gives sm the highest preference, followed
by libnbc. The components han and adapt are ignored in the selection process.
(A usage sketch follows this list of changes.)
- Allocate a temporary buffer for all lower-level leaders (length 2 segments)
- Fix the handling of MPI_IN_PLACE for gather and scatter.
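A minimal usage sketch for the preference key (standard MPI calls only; which
communicator-creation calls propagate the info, and which component names are
available, depend on the build):
```c
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Comm newcomm;
    MPI_Info info;

    MPI_Init(&argc, &argv);

    /* Prefer sm, then libnbc; ignore han and adapt on the new communicator. */
    MPI_Info_create(&info);
    MPI_Info_set(info, "ompi_comm_coll_preference", "sm,libnbc,^han,adapt");
    MPI_Comm_dup_with_info(MPI_COMM_WORLD, info, &newcomm);
    MPI_Info_free(&info);

    /* ... collectives on newcomm honor the preference list ... */

    MPI_Comm_free(&newcomm);
    MPI_Finalize();
    return 0;
}
```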
COLL HAN: Fix topology handling
- HAN should not rely on node names to determine the ordering of ranks.
Instead, use the node leaders as identifiers and take a short-cut if the
node leaders agree that ranks are consecutive. For now, error out if
the rank distribution is imbalanced.
Signed-off-by: Xi Luo <xluo12@vols.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Add a framework to support different types of threading models, including
user-space thread packages such as Qthreads and Argobots:
https://github.com/pmodels/argobots
https://github.com/Qthreads/qthreads
The default threading model is pthreads. Alternate thread models are
specified at configure time using the --with-threads=X option.
The framework is static. The threading model to use is selected at
Open MPI configure/build time.
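For example, a build configured with something like `--with-threads=argobots`
would select the Argobots layer at build time (the exact value names accepted
by `--with-threads` depend on the thread components present in the tree).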
mca/threads: implement Argobots threading layer
config: fix thread configury
- Add double quotations
- Change Argobot to Argobots
config: implement Argobots check
If the poll time is too long, MPI hangs.
This quick fix just sets it to 0, but it is not good for the
Pthreads version. Need to find a good way to abstract it.
Note that even 1 (= 1 millisecond) causes disastrous performance
degradation.
rework threads MCA framework configury
It now works more like the ompi/mca/rte configury,
modulo some edge items that are special for threading package
linking, etc.
qthreads module
some argobots cleanup
Signed-off-by: Noah Evans <noah.evans@gmail.com>
Signed-off-by: Shintaro Iwasaki <siwasaki@anl.gov>
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
This commit updates the entire codebase to use specific opal types for
all atomic variables. This is a change from the prior atomic support
which required the use of the volatile keyword. This is the first step
towards implementing support for C11 atomics as that interface
requires the use of types declared with the _Atomic keyword.
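For context, a minimal C11 sketch (plain C11 here, not the opal API itself) of
why dedicated types matter: the C11 atomic operations only accept objects
declared with `_Atomic`, so a `volatile` integer cannot simply be passed to
them.
```c
#include <stdatomic.h>
#include <stdint.h>

/* C11 atomics operate only on objects declared with _Atomic; a plain
 * "volatile int32_t" cannot be handed to atomic_fetch_add().  Hence the
 * move to dedicated atomic types that can later map onto _Atomic. */
static _Atomic int32_t counter;

static int32_t bump(void)
{
    return atomic_fetch_add(&counter, 1);
}
```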
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
We were assuming that array_of_indices has the same size as the
number of requests (incount), instead of the number of actually
active requests. While the patch is trivial, the question of the
size of array_of_indices should be clarified by the MPI Forum.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
so the latest ROMIO can be used with Open MPI.
Note that this first, naive implementation does not use the wait_fn callback.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
This commit eliminates the old opal_atomic_bool_cmpset functions. They
have been replaced by the opal_atomic_compare_exchange_strong
functions.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit renames the atomic compare-and-swap functions to indicate
the return value. This is in preparation for adding support for a
compare-and-swap that returns the old value. At the same time the
return type has been changed to bool.
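The distinction the rename encodes, illustrated with plain C11 atomics (an
analogy, not the opal functions themselves):
```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static _Atomic int32_t value;

/* A "bool" compare-and-swap reports whether the exchange happened; on
 * failure, "expected" is overwritten with the value actually observed,
 * which is what a value-returning variant would hand back directly. */
static bool cas_bool(int32_t expected, int32_t desired)
{
    return atomic_compare_exchange_strong(&value, &expected, desired);
}
```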
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds the `req_start` member to the `ompi_request_t` struct.
The `MPI_START` and `MPI_STARTALL` routines call this callback function
instead of `MCA_PML_CALL(start(...))`, so components that return
persistent requests must set this member in their request objects.
`mca_pml_base_module_t::pml_start` is not deleted because
`MCA_PML_CALL(start(...))` is still used elsewhere across OMPI.
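A hypothetical sketch of the component side (the `my_*` names are invented
here, and the callback signature is assumed to mirror `pml_start`):
```c
#include "ompi/request/request.h"

/* Invented component request type, for illustration only. */
struct my_persistent_request_t {
    ompi_request_t super;
    /* ... component-specific state ... */
};

static int my_start(size_t count, ompi_request_t **requests);

static void my_request_init(struct my_persistent_request_t *req)
{
    /* MPI_START / MPI_STARTALL now invoke this callback directly
     * instead of going through MCA_PML_CALL(start(...)). */
    req->super.req_start = my_start;
}
```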
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
Convert the predefined MPI object padding to a fixed number of bytes
(vs. a multiple of sizeof(void*)) so that the padding is the same size
between 32-bit and 64-bit builds. I.e., we won't have a situation where
we've run out of padding in 32-bit builds but still have more space
available in 64-bit builds.
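A hypothetical illustration of the scheme (names and sizes below are invented,
not the real values in the headers):
```c
#include <stdint.h>

/* Invented stand-in for a real MPI object. */
struct real_object_t {
    int64_t some_state;
    void   *some_pointer;
};

/* The "predefined" flavor reserves an absolute number of bytes (512 is an
 * example), so 32-bit and 64-bit builds agree on how much room is left
 * for future growth of the underlying object. */
#define PREDEFINED_OBJECT_SIZE 512

struct predefined_object_t {
    struct real_object_t obj;
    char padding[PREDEFINED_OBJECT_SIZE - sizeof(struct real_object_t)];
};
```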
Fixes #3610
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
* Complete rewrite of opal_pointer_array
Instead of a cache-oblivious linear search, use a bit array
to speed up the management of the free space. As a result we
slightly increase the memory used by the structure, but we get a
significant boost in performance.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
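A sketch of the free-slot lookup idea (not the actual opal_pointer_array code):
with a bitmap, a full block is skipped with one comparison and a free slot is
found with a single count-trailing-zeros instead of a linear scan over the
element array.
```c
#include <stdint.h>

/* Returns the index of the first free slot, or -1 if every slot is used.
 * A set bit means "slot occupied"; uses the GCC/Clang __builtin_ctzll. */
static int first_free_slot(const uint64_t *bitmap, int nblocks)
{
    for (int i = 0; i < nblocks; i++) {
        if (bitmap[i] != UINT64_MAX) {                 /* block has room */
            return i * 64 + __builtin_ctzll(~bitmap[i]);
        }
    }
    return -1;
}
```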
* Do not register datatypes in the f2c translation table.
The registration is now done up in the Fortran layer, by
forcing a call to MPI_Type_c2f.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
According to MPI-3.1 p. 52 and p. 53 (cited below), a request
created by `MPI_*_INIT` but not yet started by `MPI_START` or
`MPI_STARTALL` is inactive; therefore `MPI_WAIT` and its friends
must return immediately if such a request is passed.
The current implementation hangs in `MPI_WAIT` and its friends
in such a case because a persistent request is initialized with
`req_complete = REQUEST_PENDING`. This commit fixes the
initialization.
Also, this commit fixes internal requests used in `MPI_PROBE`
and `MPI_IPROBE`, which were wrongly marked as persistent.
MPI-3.1 p.52:
We shall use the following terminology: A null handle is a handle
with value MPI_REQUEST_NULL. A persistent request and the handle
to it are inactive if the request is not associated with any ongoing
communication (see Section 3.9). A handle is active if it is neither
null nor inactive. An empty status is a status which is set to return
tag = MPI_ANY_TAG, source = MPI_ANY_SOURCE, error = MPI_SUCCESS, and
is also internally configured so that calls to MPI_GET_COUNT,
MPI_GET_ELEMENTS, and MPI_GET_ELEMENTS_X return count = 0 and
MPI_TEST_CANCELLED returns false. We set a status variable to empty
when the value returned by it is not significant. Status is set in
this way so as to prevent errors due to accesses of stale information.
MPI-3.1 p.53:
One is allowed to call MPI_WAIT with a null or inactive request
argument. In this case the operation returns immediately with empty
status.
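A small example of the behavior being fixed (standard MPI calls only): the wait
on the never-started persistent request below must return immediately with an
empty status instead of hanging.
```c
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Request req;
    MPI_Status  status;
    int         buf = 0;

    MPI_Init(&argc, &argv);

    /* Create, but never MPI_Start(), a persistent send request. */
    MPI_Send_init(&buf, 1, MPI_INT, 0, 0, MPI_COMM_SELF, &req);

    /* The request is inactive, so this must return at once with an empty
     * status (tag = MPI_ANY_TAG, source = MPI_ANY_SOURCE, ...). */
    MPI_Wait(&req, &status);

    MPI_Request_free(&req);
    MPI_Finalize();
    return 0;
}
```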
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
* Add a configure time option to rename libmpi(_FOO).*
- `--with-libmpi-name=STRING`
* This commit only impacts the installed libraries.
Internal, temporary libraries have not been renamed to limit the
scope of the patch to only what is needed.
For example:
```shell
shell$ ./configure --with-libmpi-name=wookie
...
shell$ find . -name "libmpi*"
shell$ find . -name "libwookie*"
./lib/libwookie.so.0.0.0
./lib/libwookie.so.0
./lib/libwookie.so
./lib/libwookie.la
./lib/libwookie_mpifh.so.0.0.0
./lib/libwookie_mpifh.so.0
./lib/libwookie_mpifh.so
./lib/libwookie_mpifh.la
./lib/libwookie_usempi.so.0.0.0
./lib/libwookie_usempi.so.0
./lib/libwookie_usempi.so
./lib/libwookie_usempi.la
shell$
```
This commit changes the semantics of ompi request callbacks. If a
request's callback has freed or re-posted (using start) the request,
the callback must return 1 instead of OMPI_SUCCESS. This indicates
to ompi_request_complete that the request should not be modified
further. This fixes a race condition in osc/pt2pt that could lead
to the req_state being inconsistent if a request is freed between
the callback and setting the request as complete.
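A hypothetical callback following the new convention (the callback and helper
names below are invented; only the return-value rule comes from this change):
```c
/* Sketch only: a completion callback that frees its request must return 1
 * so ompi_request_complete() leaves the request alone afterwards;
 * returning OMPI_SUCCESS means the framework may still update it. */
static int my_completion_cb(struct ompi_request_t *request)
{
    if (my_component_is_done(request)) {    /* invented helper */
        ompi_request_free(&request);
        return 1;                           /* request freed: hands off */
    }
    return OMPI_SUCCESS;                    /* framework keeps ownership */
}
```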
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit fixes a hang reported by @nysal which happens when a
request is completed after a sync object is created but before the
sync object can be assigned to the request. In this case we need to
set the sync signaling field to false to ensure WAIT_SYNC_RELEASE does
not hang.
Fixes open-mpi/ompi#1828
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
by avoiding extra atomic exchanges.
Use the indices array to mark already-completed connections
in the pre-wait loop to avoid extra atomic exchanges
in the after-wait loop.
The OPAL_ENABLE_MULTI_THREADS macro is always defined as 1. This was
causing us to always use the multi-thread path for synchronization
objects. The code has been updated to use the opal_using_threads()
function. When MPI_THREAD_MULTIPLE support is disabled at build time
(2.x only) this function is a macro evaluating to false so the
compiler will optimize out the MT-path in this case. The
OPAL_ATOMIC_ADD_32 macro has been removed and replaced by the existing
OPAL_THREAD_ADD32 macro.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit fixes a performance regression introduced by the request
rework. We were always using the multi-thread path because
OPAL_ENABLE_MULTI_THREADS is either not defined or always defined to 1
depending on the Open MPI version. To fix this I removed the
conditional and added a conditional on opal_using_threads(). This path
will be optimized out in 2.0.0 in a non-thread-multiple build as
opal_using_threads is #defined to false in that case.
Fixes open-mpi/ompi#1806
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit fixes a bug in waitany that causes the code to go past the
beginning of the request array. The loop conditional i >= 0 is invalid
since i is unsigned. Changed the loop to check (i+1) > 0.
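The pattern in question, shown generically (not the literal waitany code):
```c
#include <stddef.h>

/* With an unsigned index, "i >= 0" is always true, so a descending loop
 * runs past element 0.  Checking "(i + 1) > 0" stops once i wraps around
 * from 0 to the maximum value (and never iterates when count == 0). */
static void visit_backwards(int *array, size_t count)
{
    for (size_t i = count - 1; (i + 1) > 0; i--) {
        array[i] = 0;   /* visit array[i] */
    }
}
```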
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
I am not sure if we should continue to maintain the request support
for FT_CR, but I tried here to simplify the code while maintaining
the same meaning.
Rewrite the ompi_request_complete function to take into account the
with_signal argument. Change the comment to explain the expected
behavior.
Alter all the ompi_request_complete uses to make sure the status of the
request is set before calling ompi_request_complete.
This commit fixes two bugs in MPI_Waitany:
- If all requests are inactive then the sync wait would hang forever
because no requests are attached to the sync.
- The request pointer was pointing to the request before the completed
request, which caused the wrong request to be freed or marked inactive.
MPI_Waitsome had a similar issue if all the requests were pending.
These issues were identified by MTT.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
The request.h header is unfortunately included by files in the C++
bindings. C++ does not allow assigning from void * to another
pointer type without a cast. This commit adds the cast. We can clean this
up when the C++ bindings are deleted.
Fixes open-mpi/ompi#1707
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
* Remodel the request.
Added the wait sync primitive and integrated it into the PML and MTL
infrastructure. The multi-threaded requests are now significantly
less heavy and less noisy (only the threads associated with completed
requests are signaled).
* Fix the condition to release the request.
If during the request completion callback we post another request that
completes right away (such as a small send or a match for an unexpected
short message), we will try to complete the second request while holding
the lock for the completion of the first. For performance reasons
(mainly to avoid unlocking and locking the request mutex several times)
we have made the request lock recursive.
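A minimal pthreads sketch of why the lock is recursive (generic illustration,
not the actual ompi request lock): completing a second request from inside the
first request's completion path re-enters the same mutex, which a recursive
mutex permits.
```c
#include <pthread.h>

static pthread_mutex_t request_lock;

static void request_lock_init(void)
{
    pthread_mutexattr_t attr;

    pthread_mutexattr_init(&attr);
    /* Recursive type: the same thread may acquire the mutex again while
     * already holding it, which covers the nested-completion case above. */
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&request_lock, &attr);
    pthread_mutexattr_destroy(&attr);
}
```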