1
1

8 Коммитов

Автор SHA1 Сообщение Дата
Jeff Squyres
35438ae9b5 mpi/finalized: revamp INITIALIZED/FINALIZED
Per MPI-3.1:8.7.1 p361:11-13, it's valid for MPI_FINALIZED to be
invoked during an attribute destruction callback (e.g., during the
destruction of keyvals on MPI_COMM_SELF during the very beginning of
MPI_FINALIZE).  In such cases, MPI_FINALIZED must return "false".

Prior to this commit, we hung in FINALIZED if it were invoked during
a COMM_SELF attribute destruction callback in FINALIZE.  See
https://github.com/open-mpi/ompi/issues/5084.

This commit converts the MPI_INITIALIZED / MPI_FINALIZED
infrastructure to use a single enum (ompi_mpi_state, set atomically)
to represent the state of MPI:

- not initialized
- init started
- init completed
- finalize started
- finalize past COMM_SELF destruction
- finalize completed

The "finalize past COMM_SELF destruction" state is what allows us to
return "false" from MPI_FINALIZED before COMM_SELF has been fully
destroyed / all attribute callbacks have been invoked.

Since this state is checked at nearly every MPI API call (to see if
we're outside of the INIT/FINALIZE epoch), care was taken to use
atomics to *set* the ompi_mpi_state value in ompi_mpi_init() and
ompi_mpi_finalize(), but performance-critical code paths can simply
read the variable without needing to use a slow call to an
opal_atomic_*() function.

Thanks to @AndrewGaspar for reporting the issue.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-06-01 13:36:29 -07:00
Mark Allen
552216f9ba scripted symbol name change (ompi_ prefix)
Passed the below set of symbols into a script that added ompi_ to them all.

Note that if processing a symbol named "foo" the script turns
    foo  into  ompi_foo
but doesn't turn
    foobar  into  ompi_foobar

But beyond that the script is blind to C syntax, so it hits strings and
comments etc as well as vars/functions.

    coll_base_comm_get_reqs
    comm_allgather_pml
    comm_allreduce_pml
    comm_bcast_pml
    fcoll_base_coll_allgather_array
    fcoll_base_coll_allgatherv_array
    fcoll_base_coll_bcast_array
    fcoll_base_coll_gather_array
    fcoll_base_coll_gatherv_array
    fcoll_base_coll_scatterv_array
    fcoll_base_sort_iovec
    mpit_big_lock
    mpit_init_count
    mpit_lock
    mpit_unlock
    netpatterns_base_err
    netpatterns_base_verbose
    netpatterns_cleanup_narray_knomial_tree
    netpatterns_cleanup_recursive_doubling_tree_node
    netpatterns_cleanup_recursive_knomial_allgather_tree_node
    netpatterns_cleanup_recursive_knomial_tree_node
    netpatterns_init
    netpatterns_register_mca_params
    netpatterns_setup_multinomial_tree
    netpatterns_setup_narray_knomial_tree
    netpatterns_setup_narray_tree
    netpatterns_setup_narray_tree_contigous_ranks
    netpatterns_setup_recursive_doubling_n_tree_node
    netpatterns_setup_recursive_doubling_tree_node
    netpatterns_setup_recursive_knomial_allgather_tree_node
    netpatterns_setup_recursive_knomial_tree_node
    pml_v_output_close
    pml_v_output_open
    intercept_extra_state_t
    odls_base_default_wait_local_proc
    _event_debug_mode_on
    _evthread_cond_fns
    _evthread_id_fn
    _evthread_lock_debugging_enabled
    _evthread_lock_fns
    cmd_line_option_t
    cmd_line_param_t
    crs_base_self_checkpoint_fn
    crs_base_self_continue_fn
    crs_base_self_restart_fn
    event_enable_debug_output
    event_global_current_base_
    event_module_include
    eventops
    sync_wait_mt
    trigger_user_inc_callback
    var_type_names
    var_type_sizes

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2017-07-11 02:13:23 -04:00
Ralph Castain
1e2019ce2a Revert "Update to sync with OMPI master and cleanup to build"
This reverts commit cb55c88a8b7817d5891ff06a447ea190b0e77479.
2016-11-22 15:03:20 -08:00
Ralph Castain
cb55c88a8b Update to sync with OMPI master and cleanup to build
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-22 14:24:54 -08:00
Nathan Hjelm
a7b0c00ab6 fix memory leaks and valgrind errors
This commit fixes several vagrind errors. Included:

 - installdirs did not correctly reinitialize all pointers to NULL
   at close. This causes valgrind errors on a subsequent call to
   opal_init_tool.

 - several opal strings were leaked by opal_deregister_params which
   was setting them to NULL instead of letting them be freed by the
   MCA variable system.

 - move opal_net_init to AFTER the variable system is initialized and
   opal's MCA variables have been registered. opal_net_init uses a
   variable registered by opal_register_params!

 - do not leak ompi_mpi_main_thread when it is allocated by
   MPI_T_init_thread.

 - do not overwrite ompi_mpi_main_thread if it is already set (by
   MPI_T_init_thread).

 - mca_base_var: read_files was overwritting mca_base_var_file_list
   even if it was non-NULL.

 - mca_base_var: set all file global variables to initial states on
   finalize.

 - btl/vader: decrement enumerator reference count to ensure that it
   is freed.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-11 09:28:35 -06:00
Nathan Hjelm
005c6022e2 mca/base: fix bugs in framework deregistration/re-registration
There were a number of bugs in the framework/variable code that
affected deregistration:

 - Frameworks could be erroneously closed if seperately registered and
   opened then subsequently closed. This was a bug in the original
   design which only reference counted opens but not
   registrations. This would cause undefined behavior if
   MPI_T_finalize actually calls ompi_info_close_components as
   intended. Now both registrations and opens are reference counted
   and frameworks/components are not torn down until the matching
   number of close calls have been made.

 - group_find_by_name did not pass the invalidok flags down
   to mca_base_var_group_get_internal correctly.

 - Group deregistration caused the group to be completely reset. This
   does not match the behavior required by MPI_T as it could reduce
   the number of variables/subgroups in a group.

This commit also updates MPI_T_finalize to call
ompi_info_close_components as originally intended.

Closes #374

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-03-09 16:52:53 -06:00
Jeff Squyres
da87b506bd Remove warnings identified by clang 3.4
* Remove unused static functions
 * Remove unused static variables

cmr=v1.8:reviewer=hjelmn

This commit was SVN r31023.
2014-03-12 13:17:54 +00:00
Nathan Hjelm
e6e9f2c6fd Add profiling function definitions for MPI_T and add a missing type into mpi.h
This commit was SVN r28803.
2013-07-16 16:03:33 +00:00