41 commits

Author SHA1 Message Date
Mikhail Kurnosov
73e048b62a coll/libnbc: add Rabenseifner's algorithm for MPI_Iallreduce
An implementation of R. Rabenseifner's algorithm for MPI_Iallreduce.

This algorithm is a combination of a reduce-scatter implemented with recursive vector halving
and recursive distance doubling, followed by an allgather.

Limitations:
-- count >= 2^{\lfloor \log_2 p \rfloor}
-- commutative operations only
-- intra-communicators only

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2018-10-18 08:50:16 +07:00
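
The three limitations listed in this commit message can be read as a simple applicability test. The following is a minimal sketch, not the libnbc code: the helper names `largest_pow2_le` and `rabenseifner_applicable` are hypothetical, and `p` is assumed to be the communicator size.

```c
#include <stdbool.h>
#include <stddef.h>

/* Largest power of two that is <= p, i.e. 2^floor(log2 p), for p >= 1. */
static int largest_pow2_le(int p)
{
    int pow2 = 1;
    while (pow2 <= p / 2) {
        pow2 *= 2;
    }
    return pow2;
}

/* Hypothetical helper: true when all three limitations from the commit
 * message hold (enough elements, commutative op, intra-communicator). */
static bool rabenseifner_applicable(size_t count, int comm_size,
                                    bool op_is_commutative, bool is_intra_comm)
{
    if (comm_size < 1) {
        return false;
    }
    return op_is_commutative && is_intra_comm &&
           count >= (size_t)largest_pow2_le(comm_size);
}
```

With 6 processes the threshold is 4 elements, so a 3-element vector would have to fall back to another algorithm.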
Nathan Hjelm
43547ade4c
Merge pull request #5663 from mkurnosov/coll-ireduce-rabenseifner
coll/libnbc: add Rabenseifner's algorithm for MPI_Ireduce
2018-10-17 09:02:06 -06:00
Mikhail Kurnosov
a7386c1e09 coll/libnbc: add recursive doubling algorithm for MPI_Iallgather
Implements a recursive doubling algorithm for MPI_Iallgather.
The algorithm can be used only with a power-of-two number of processes.

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2018-10-11 21:43:13 +07:00
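
The communication pattern of recursive doubling can be illustrated without MPI. The sketch below simulates all ranks in one process and only tracks which blocks each rank holds; the function name and the `MAX_P` limit are assumptions made for this illustration, not part of libnbc.

```c
#include <string.h>

#define MAX_P 64

/* Simulates a recursive doubling allgather for p ranks in one process.
 * have[r][b] is 1 when rank r holds block b. Returns the number of
 * rounds, or -1 if p is out of range or not a power of two. */
static int allgather_recursive_doubling(int p, unsigned char have[MAX_P][MAX_P])
{
    if (p < 1 || p > MAX_P || (p & (p - 1)) != 0) {
        return -1;
    }

    memset(have, 0, sizeof(unsigned char) * MAX_P * MAX_P);
    for (int r = 0; r < p; r++) {
        have[r][r] = 1;              /* each rank starts with its own block */
    }

    int rounds = 0;
    for (int dist = 1; dist < p; dist *= 2) {
        unsigned char next[MAX_P][MAX_P];
        memcpy(next, have, sizeof(next));
        for (int r = 0; r < p; r++) {
            int peer = r ^ dist;     /* exchange partner in this round */
            for (int b = 0; b < p; b++) {
                if (have[peer][b]) {
                    next[r][b] = 1;  /* receive everything the peer held */
                }
            }
        }
        memcpy(have, next, sizeof(next));
        rounds++;
    }
    return rounds;
}
```

For 8 ranks the simulation finishes in 3 rounds with every rank holding all 8 blocks. The XOR partner rule is what ties this formulation to power-of-two process counts.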
Mikhail Kurnosov
b0429d25df coll/libnbc: add knomial tree algorithm for MPI_Ibcast
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2018-10-09 20:43:04 +07:00
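
This commit message does not describe the tree layout, so the following sketch shows one common k-nomial tree formulation rather than the libnbc implementation. Ranks are virtual ranks relative to the root (root is 0), and the function names are hypothetical.

```c
/* Parent of virtual rank vrank in a k-nomial tree of radix k >= 2.
 * The parent is obtained by clearing the lowest nonzero base-k digit.
 * Returns -1 for the root or for invalid arguments. */
static int knomial_parent(int vrank, int k)
{
    if (vrank <= 0 || k < 2) {
        return -1;
    }
    int mask = 1;
    while ((vrank / mask) % k == 0) {
        mask *= k;
    }
    return vrank - ((vrank / mask) % k) * mask;
}

/* Writes the children of vrank into out (capacity cap) and returns how
 * many were written. Children are vrank + j * k^i for every digit
 * position i below the lowest nonzero base-k digit of vrank. */
static int knomial_children(int vrank, int k, int comm_size, int *out, int cap)
{
    int n = 0;
    if (k < 2 || vrank < 0 || vrank >= comm_size) {
        return 0;
    }
    for (long mask = 1; mask < comm_size; mask *= k) {
        if ((vrank / mask) % k != 0) {
            break;                   /* reached the lowest nonzero digit */
        }
        for (int j = 1; j < k; j++) {
            long child = vrank + j * mask;
            if (child < comm_size && n < cap) {
                out[n++] = (int)child;
            }
        }
    }
    return n;
}
```

With radix 2 this reduces to a binomial tree. A larger radix gives a shallower tree at the cost of more sends per node.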
Mikhail Kurnosov
7bd63e79c8 coll/libnbc: add Rabenseifner's algorithm for MPI_Ireduce
An implementation of R. Rabenseifner's algorithm for MPI_Ireduce.
This algorithm is a combination of a reduce-scatter implemented with recursive vector halving
and recursive distance doubling, followed by a gather.

Limitations:
-- count >= 2^{\lfloor \log_2 p \rfloor}
-- commutative operations only
-- intra-communicators only

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2018-10-09 20:27:09 +07:00
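
The reduce-scatter phase named in this commit message (recursive vector halving with recursive distance doubling) can be simulated in a single process. The sketch below makes simplifying assumptions that the commit itself does not require: the process count is a power of two, `count` is divisible by it, and the operation is integer addition. All names are hypothetical.

```c
#include <string.h>

#define RS_MAX_P 16
#define RS_MAX_COUNT 64

/* Simulated reduce-scatter (sum). data[r] is rank r's input vector and is
 * reduced in place. On return, rank r owns the fully reduced elements
 * [lo[r], lo[r] + count / p). Returns 0 on success, -1 when the
 * simplifying assumptions of this sketch are violated. */
static int reduce_scatter_halving(int p, int count,
                                  long data[RS_MAX_P][RS_MAX_COUNT],
                                  int lo[RS_MAX_P])
{
    if (p < 1 || p > RS_MAX_P || (p & (p - 1)) != 0) {
        return -1;
    }
    if (count < p || count > RS_MAX_COUNT || count % p != 0) {
        return -1;
    }

    int len = count;                     /* current window length */
    for (int r = 0; r < p; r++) {
        lo[r] = 0;
    }

    for (int dist = 1; dist < p; dist *= 2) {   /* distance doubles */
        long prev[RS_MAX_P][RS_MAX_COUNT];
        memcpy(prev, data, sizeof(prev));
        int half = len / 2;                     /* vector halves */
        for (int r = 0; r < p; r++) {
            int peer = r ^ dist;
            /* The lower rank of each pair keeps the lower half of the
             * shared window, the higher rank keeps the upper half. */
            int keep_lo = (r & dist) ? lo[r] + half : lo[r];
            for (int i = keep_lo; i < keep_lo + half; i++) {
                data[r][i] = prev[r][i] + prev[peer][i];
            }
            lo[r] = keep_lo;
        }
        len = half;
    }
    return 0;
}
```

In this formulation the block a rank ends up owning follows the bit-reversed rank order, so a subsequent gather has to account for that placement.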
Mikhail Kurnosov
9557fa087f Resolve merge conflicts
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2018-10-05 21:40:27 +07:00
Mikhail Kurnosov
dfe203e167 coll/libnbc: add recursive doubling algorithm for MPI_Iexscan
Implements recursive doubling algorithm for MPI_Iexscan.
The algorithm preserves the order of operations, so it can be used with both
commutative and non-commutative operations.

The MCA parameter 'coll_libnbc_iexscan_algorithm' was added for dynamic
algorithm selection.

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2018-09-23 19:54:27 +07:00
Mikhail Kurnosov
3d43ff0f32 coll/libnbc: add recursive doubling algorithm for MPI_Iscan
Implements a recursive doubling algorithm for MPI_Iscan. The algorithm preserves the order of operations, so it can be used with both commutative and non-commutative operations.

The MCA parameter coll_libnbc_iscan_algorithm was added for dynamic algorithm selection.

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2018-09-22 21:09:12 +07:00
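
To illustrate why recursive doubling can preserve operand order, the sketch below simulates an inclusive scan in one process using a deliberately non-commutative operation (composition of affine maps). It follows the textbook recursive doubling scan and is an assumption about the general technique, not a copy of the libnbc schedule. All names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define SCAN_MAX_P 64
#define SCAN_MOD 1000003

/* Affine map x -> a*x + b (mod SCAN_MOD). Composition of such maps is
 * associative but not commutative. */
typedef struct { int64_t a, b; } affine_t;

/* left op right: apply left first, then right. */
static affine_t affine_op(affine_t left, affine_t right)
{
    affine_t out;
    out.a = (left.a * right.a) % SCAN_MOD;
    out.b = (left.b * right.a + right.b) % SCAN_MOD;
    return out;
}

/* Simulated inclusive scan: out[r] = in[0] op in[1] op ... op in[r].
 * Works for any 1 <= p <= SCAN_MAX_P; returns -1 otherwise. */
static int scan_recursive_doubling(int p, const affine_t *in, affine_t *out)
{
    affine_t partial[SCAN_MAX_P];
    if (p < 1 || p > SCAN_MAX_P) {
        return -1;
    }
    for (int r = 0; r < p; r++) {
        partial[r] = in[r];
        out[r] = in[r];
    }
    for (int mask = 1; mask < p; mask *= 2) {
        affine_t prev[SCAN_MAX_P];
        memcpy(prev, partial, sizeof(affine_t) * (size_t)p);
        for (int r = 0; r < p; r++) {
            int peer = r ^ mask;
            if (peer >= p) {
                continue;            /* no partner in this round */
            }
            if (r > peer) {
                /* Peer data covers lower ranks: it goes on the left. */
                partial[r] = affine_op(prev[peer], prev[r]);
                out[r] = affine_op(prev[peer], out[r]);
            } else {
                /* Peer data covers higher ranks: it goes on the right
                 * and does not contribute to this rank's result. */
                partial[r] = affine_op(prev[r], prev[peer]);
            }
        }
    }
    return 0;
}
```

The key point is that received data is always placed on the side matching the partner's rank, so operands are never reordered.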
Nathan Hjelm
000f9eed4d opal: add types for atomic variables
This commit updates the entire codebase to use specific opal types for
all atomic variables. This is a change from the prior atomic support
which required the use of the volatile keyword. This is the first step
towards implementing support for C11 atomics as that interface
requires the use of types declared with the _Atomic keyword.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-09-14 10:48:55 -06:00
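
This commit message contrasts `volatile`-based atomics with C11's `_Atomic` types. As a rough illustration of the C11 side only, the sketch below wraps `_Atomic` in a dedicated typedef; the names `example_atomic_int32_t` and `example_add_fetch` are hypothetical and are not the opal types or functions.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical dedicated atomic type: the _Atomic qualifier is part of
 * the type, so plain volatile int32_t variables cannot be passed here. */
typedef _Atomic int32_t example_atomic_int32_t;

/* Atomically adds delta and returns the new value. */
static int32_t example_add_fetch(example_atomic_int32_t *addr, int32_t delta)
{
    return atomic_fetch_add(addr, delta) + delta;
}
```

Using a distinct type lets the compiler reject accidental non-atomic declarations, which the `volatile` convention could not enforce.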
KAWASHIMA Takahiro
a38e9e064f coll: Update COLL module interface version to 2.3.0
Members for persistent operations are added to the module structure
in a prior commit.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2018-06-11 17:22:16 +09:00
KAWASHIMA Takahiro
0b8b0f8393 coll/libnbc: Implement MPI_STARTALL
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2018-06-11 17:22:16 +09:00
KAWASHIMA Takahiro
5c5de3a4fb coll/libnbc: Fix handling of completed request
Because a persistent request does not free its `schedule` object
when the communication completes, the `NBC_Progress` function cannot
determine the completion using `schedule`.

Without this change, a hang occurs when the `NBC_Progress` function
is called recursively through the `NBC_Start_round` function.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2018-06-11 17:22:16 +09:00
KAWASHIMA Takahiro
8e5690bf5c coll/libnbc: Correct persistent request handling
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2018-06-11 17:22:16 +09:00
Gilles Gouaillardet
a9609b6bf8 coll/libnbc: add persistent collectives implementation
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-11 09:53:37 +09:00
Carlos Bederián
1767b218fb coll/libnbc: demote progress_lock to regular flag
Signed-off-by: Carlos Bederián <bc@famaf.unc.edu.ar>
2017-07-24 20:19:55 -03:00
Joshua Hursey
78006f93a4 coll: Move reduce_local into the coll framework
* Since we are adding a new function to `mca_coll_base_module_2_1_0_t`
   we need to increase the version of the module structure to `2_2_0`.
 * Add a comment just above the PREDEFINED_COMMUNICATOR_PAD describing
   its purpose and when it should change, to help future developers
   trying to answer the question noted in the comment.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-02-14 08:56:07 -06:00
Gilles Gouaillardet
15098161a3 coll/libnbc: add some comments on how locks are used
no code change

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-11-30 17:29:51 +09:00
Ralph Castain
1e2019ce2a Revert "Update to sync with OMPI master and cleanup to build"
This reverts commit cb55c88a8b.
2016-11-22 15:03:20 -08:00
Ralph Castain
cb55c88a8b Update to sync with OMPI master and cleanup to build
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-22 14:24:54 -08:00
Gilles Gouaillardet
2c94a3a6f3 coll/libnbc: fix race condition with multi threaded apps
protect the mca_coll_libnbc_component.active_requests list with
the new mca_coll_libnbc_component.lock mutex.

Thanks Jie Hu for the report

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-11-21 10:21:47 +09:00
Joshua Hursey
350ef67fe0 coll/libnbc: Work around for non-uniform data types in ibcast
* If (legal) non-uniform data type signatures are used in ibcast
   then the chosen algorithm may fail on the request, and worst case
   it could produce wrong answers.
 * Add an MCA parameter that, by default, protects the user from this
   scenario. If the user really wants to use it then they have to
   'opt-in' by setting the following parameter to false:
   - `-mca coll_libnbc_ibcast_skip_dt_decision f`
 * Once the following Issues are resolved then this parameter can
   be removed.
   - https://github.com/open-mpi/ompi/issues/2256
   - https://github.com/open-mpi/ompi/issues/1763

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2016-11-01 13:33:23 -05:00
Gilles Gouaillardet
e01bac962f coll: do not cast away the const modifier when this is not necessary
update the coll framework and mpi c bindings
2015-09-09 09:18:57 +09:00
Nathan Hjelm
d42e0968b1 coll/libnbc: rewrite parts of libnbc
This commit rewrites parts of libnbc to fix issues identified by
coverity and myself. The changes are as follows:

 - libnbc functions would return invalid error codes (internal to
   libnbc) to the mpi layer. These code names are of the form
   NBC_. They do not match up with the error codes expected by the mpi
   layer. I purged the use of all these error codes with the exception
   of NBC_OK and NBC_CONTINUE in progress. These codes are used to
   identify when a request handle is complete.

 - Handles and schedules were leaked by all collective routines on
   error. A new routine was added to return a collective handle
   (NBC_Return_handle).

 - Temporary buffers containing in/out neighbors for neighborhood
   collectives were always leaked.

 - Neighborhood collectives contained code to handle MPI_IN_PLACE which
   is never a valid input for the send or receive buffer. Stripped this
   code out.

 - Files were inconsistently named. Most are nbc_isomething.c but one
   was named coll_libnbc_ireduce_scatter_block.c.

 - Made the NBC_Schedule "structure" an object so it can be
   retained/released. This may enable the use of schedule caching at a
   later time. More testing will be needed to ensure the caching code
   works. If it doesn't the code should be stripped out completely.

 - Added code to simplify the common case of scheduling send/recv +
   barrier.

 - Code cleanup for readability.

The code now passes the clang static analyzer.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-08-10 11:53:25 -06:00
Ralph Castain
869041f770 Purge whitespace from the repo
2015-06-23 20:59:57 -07:00
Nathan Hjelm
5f1254d710 Update code base to use the new opal_free_list_t
Use of the old ompi_free_list_t and ompi_free_list_item_t is
deprecated. These classes will be removed in a future commit.

This commit updates the entire code base to use opal_free_list_t and
opal_free_list_item_t.

Notes:

OMPI_FREE_LIST_*_MT -> opal_free_list_* (uses opal_using_threads ())

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-24 10:05:45 -07:00
Gilles Gouaillardet
0f983d5a4f add a disable function for coll module
2014-10-14 14:46:36 +09:00
Nathan Hjelm
c5596548b2 MPI-3: Add support for neighborhood collectives
Blocking versions are simple linear algorithms implemented in coll/basic. Non-
blocking versions are from libnbc 1.1.1. All algorithms have been tested with
simple test cases.

cmr=v1.7.4:reviewer=jsquyres

This commit was SVN r29265.
2013-09-26 21:55:08 +00:00
Aurelien Bouteiller
e1066143a4 rename ompi_free_list operations to _mt, as per discussions at last face to face meeting
This commit was SVN r28734.
2013-07-08 22:07:52 +00:00
George Bosilca
c9e5ab9ed1 Our macros for the OMPI-level free list had one extra argument, a possible return
value to signal that the operation of retrieving the element from the free list
failed. However in this case the returned pointer was set to NULL as well, so the
error code was redundant. Moreover, this was a continuous source of warnings when
the picky mode is on.

The attached patch removes the rc argument from the OMPI_FREE_LIST_GET and
OMPI_FREE_LIST_WAIT macros, and changes callers to check whether the item is NULL instead of
using the return code.

This commit was SVN r28722.
2013-07-04 08:34:37 +00:00
Brian Barrett
14f4aa1198 Fix memory leak in nbc init
This commit was SVN r27884.
2013-01-21 22:45:59 +00:00
Brian Barrett
c0f1775620 Fix warnings in nbc
This commit was SVN r27514.
2012-10-29 19:52:43 +00:00
Brian Barrett
8b40c0de9b * Lock around tag management, so that it's thread safe
* Only register the progress function on first call to a non-blocking
  collective operation, to try to reduce overall performance impact
* Fix tag management in roll-over case

This commit was SVN r27498.
2012-10-26 15:36:09 +00:00
Brian Barrett
58413fa1e4 * properly setup communication infrastructure for libnbc.
* Prevent infinite recursion in progress loop.

Should fix the improper barrier Eugene was seeing.

This commit was SVN r26758.
2012-07-06 13:59:03 +00:00
Brian Barrett
7e67bfa175 Use OMPI's ops instead of the libnbc ops.
This commit was SVN r26708.
2012-07-02 15:47:22 +00:00
Brian Barrett
0b887ab5a1 * Remove unneeded prototype that was causing compile issues anyway
* Use proper tag space (the negatives below the blocking communicators)
  instead of the point-to-point space
* Use the PML interface instead of the MPI interface, since the MPI
  interface 1) shouldn't be used by components and 2) doesn't like
  negative tags

This commit was SVN r26693.
2012-06-28 16:52:03 +00:00
Brian Barrett
32e70b691a Re-enable non-blocking collectives in libnbc after finding issue with the definition of
NBC_CACHE_SCHEDULE not being propagated to all uses.

This commit was SVN r26686.
2012-06-27 22:08:19 +00:00
Brian Barrett
d85fdd2605 temporarily back out r26682 and r26683 until I can figure out why they cause crashes during shutdown
This commit was SVN r26684.

The following SVN revision numbers were found above:
  r26682 --> open-mpi/ompi@15a30af11f
  r26683 --> open-mpi/ompi@f6ea4b7234
2012-06-27 19:32:53 +00:00
Brian Barrett
f6ea4b7234 Remove now unneeded header file
This commit was SVN r26683.
2012-06-27 18:43:40 +00:00
Brian Barrett
15a30af11f Turn on all the non-blocking collectives provided by libnbc...
This commit was SVN r26682.
2012-06-27 18:32:57 +00:00
Brian Barrett
3933d0a8f0 Ibarrier works! :)
This commit was SVN r26680.
2012-06-27 15:58:17 +00:00
Brian Barrett
7bdeafb772 Start bringing in libnbc. .ompi_ignored, as there's still a long way to go
This commit was SVN r26658.
2012-06-25 22:38:06 +00:00