1
1
Граф коммитов

21 Коммитов

Автор SHA1 Сообщение Дата
Jeff Squyres
35438ae9b5 mpi/finalized: revamp INITIALIZED/FINALIZED
Per MPI-3.1:8.7.1 p361:11-13, it's valid for MPI_FINALIZED to be
invoked during an attribute destruction callback (e.g., during the
destruction of keyvals on MPI_COMM_SELF during the very beginning of
MPI_FINALIZE).  In such cases, MPI_FINALIZED must return "false".

Prior to this commit, we hung in FINALIZED if it were invoked during
a COMM_SELF attribute destruction callback in FINALIZE.  See
https://github.com/open-mpi/ompi/issues/5084.

This commit converts the MPI_INITIALIZED / MPI_FINALIZED
infrastructure to use a single enum (ompi_mpi_state, set atomically)
to represent the state of MPI:

- not initialized
- init started
- init completed
- finalize started
- finalize past COMM_SELF destruction
- finalize completed

The "finalize past COMM_SELF destruction" state is what allows us to
return "false" from MPI_FINALIZED before COMM_SELF has been fully
destroyed / all attribute callbacks have been invoked.

Since this state is checked at nearly every MPI API call (to see if
we're outside of the INIT/FINALIZE epoch), care was taken to use
atomics to *set* the ompi_mpi_state value in ompi_mpi_init() and
ompi_mpi_finalize(), but performance-critical code paths can simply
read the variable without needing to use a slow call to an
opal_atomic_*() function.

Thanks to @AndrewGaspar for reporting the issue.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-06-01 13:36:29 -07:00
Xin Zhao
4aad386c2b ompi/oshmem: fix bug in shmem_finalize.
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2018-04-02 09:07:59 -05:00
Alex Mikheev
292d185c30
oshmem: refactor group cache
- Use opal hash table instead of list for group lookup.
- Code cleanup/refactoring. Group cache is now a part
  of the proc_group.

Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2018-02-22 11:48:06 +02:00
Nathan Hjelm
9d0b3fe9f4 opal/asm: remove opal_atomic_bool_cmpset functions
This commit eliminates the old opal_atomic_bool_cmpset functions. They
have been replaced by the opal_atomic_compare_exchange_strong
functions.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-11-30 10:41:22 -07:00
Nathan Hjelm
3ff34af355 opal: rename opal_atomic_cmpset* to opal_atomic_bool_cmpset*
This commit renames the atomic compare-and-swap functions to indicate
the return value. This is in preperation for adding support for a
compare-and-swap that returns the old value. At the same time the
return type has been changed to bool.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-10-31 12:47:23 -06:00
Ralph Castain
1e2019ce2a Revert "Update to sync with OMPI master and cleanup to build"
This reverts commit cb55c88a8b.
2016-11-22 15:03:20 -08:00
Ralph Castain
cb55c88a8b Update to sync with OMPI master and cleanup to build
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-22 14:24:54 -08:00
Igor Ivanov
05d947d55a oshmem: Align OSHMEM API with spec v1.2 (support environment variables) 2015-11-24 18:57:56 +02:00
Gilles Gouaillardet
291a464efb configury: remove the --enable-mpi-profiling option
and directly call the PMPI_* symbols from C and Fortran bindings
2015-10-13 08:52:35 +09:00
Gilles Gouaillardet
53b952dc2b oshmem: invoke the C PMPI_* subroutines instead of the MPI_* ones
when profiling is built.
This prevents oshmem subroutines from being wrapped twice by third
party tools (e.g. once in oshmem and once in MPI)
see discussion starting at http://www.open-mpi.org/community/lists/devel/2015/08/17842.php

Thanks to Bert Wesarg for bringing this to our attention
2015-10-13 08:52:03 +09:00
Igor Ivanov
4b8d9b8eff oshmem/proc: Refactor proc component
Most functionality of oshmem_proc duplicates ompi_proc. In addition
to that, Current logic does not allow to do oshmem initialization
w/o ompi startup.
So this refactoring allows to  avoid code duplication, decrease used
memory and make oshmem support easier.
Now oshmem_proc is transparent ompi_proc structure, that can be
extended by oshmem specific data.

Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com>
2015-09-17 18:49:00 +03:00
Ralph Castain
869041f770 Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
Ralph Castain
552c9ca5a0 George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-)
WHAT:    Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL

All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have express interest in having a more generic communication infrastructure, without all the OMPI layer dependencies.  This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP.  Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose.  UTK with support from Sandia, developped a version of Open MPI where the entire communication infrastucture has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to completing this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs.  A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic.

This commit was SVN r32317.
2014-07-26 00:47:28 +00:00
Mike Dubman
323e4418b9 OSHMEM: extract memheap allocate methods into separate framework
- similar to opal/shmem
- next step is some refactoring and merge into opal/shmem
 Developed by Igor, reviewed by AlexM, MikeD

This commit fixes trac:4261.

This commit was SVN r30855.

The following Trac tickets were found above:
  Ticket 4261 --> https://svn.open-mpi.org/trac/ompi/ticket/4261
2014-02-26 16:32:23 +00:00
Mike Dubman
27a763c86c OSHMEM: finalization fixes
Fixes mxm endpoint destruction and hung during
SHMEM finalization.

Add a barrier between spml del procs and finalization.
Not having it caused hungs because ikrit spml can not properly
disconnect if its peer already finizalized.

Refs trac:3763

This commit was SVN r30572.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2014-02-06 08:40:43 +00:00
Mike Dubman
0d8424e0a6 OSHMEM: Fixes issue with recent segfault in finalize related to mca_base_group_unselect.
We need to explicitly call mca_base_group_unselect in finalize
for each group that are not freed with oshmem_group_cache_list_free before we unloading scoll framework.

Refs trac:3763

This commit was SVN r30311.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2014-01-17 16:57:12 +00:00
Mike Dubman
2e138ddd05 OSHMEM: Use MPI calls for mkey exchange
fixed by Alex, reviewed by miked
Refs: 3763

This commit was SVN r30056.
2013-12-23 09:20:42 +00:00
Mike Dubman
0ddc2bc214 C99ing ...
Refs: 3763

This commit was SVN r29756.
2013-11-26 12:46:56 +00:00
Joshua Ladd
b3f88c4a1d Per the RFC schedule, this commit adds Mellanox OpenSHMEM to the trunk. It does not yet run on OSX or with CM PML for an MTL other than MXM. Mellanox is aware of these issues and is in the process of resolving them. This should be added to \ncmr=v1.7.4:subject=Move OSHMEM to 1.7.4:reviewer=rhc
This commit was SVN r29153.
2013-09-10 15:34:09 +00:00
Joshua Ladd
70ad711337 Backing out the Open SHMEM project
This commit was SVN r28050.
2013-02-12 17:45:27 +00:00
Mike Dubman
ff384daab4 Added new project: oshmem.
This commit was SVN r28048.
2013-02-12 15:33:21 +00:00