1
1
openmpi/opal/mca/base/mca_base_components_open.c

170 строки
6.4 KiB
C
Исходник Обычный вид История

/* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
- massive change for module<-->component name fixes throughout the code base. - many (most) mca type names have "component" or "module" in them, as relevant, just to further distinguish the difference between component data/actions and module data/actions. All developers are encouraged to perpetuate this convention when you create types that are specific to a framework, component, or module - did very little to entire framework (just the basics to make it compile) because it's just about to be almost entirely replaced - ditto for io / romio - did not work on elan or ib components; have to commit and then convert those on a different machine with the right libraries and headers - renamed a bunch of *_module.c files to *_component.c and *module*c to *component*c (a few still remain, e.g., ptl/ib, ptl/elan, etc.) - modified autogen/configure/build process to match new filenames (e.g., output static-components.h instead of static-modules.h) - removed DOS-style cr/lf stuff in ns/ns.h - added newline to end of file src/util/numtostr.h - removed some redundant error checking in the top-level topo functions - added a few {} here and there where people "forgot" to put them in for 1 line blocks ;-) - removed a bunch of MPI_* types from mca header files (replaced with corresponding ompi_* types) - all the ptl components had version numbers in their structs; removed - converted a few more elements in the MCA base to use the OBJ interface -- removed some old manual reference counting kruft This commit was SVN r1830.
2004-08-02 04:24:22 +04:00
/*
* Copyright (c) 2004-2008 The Trustees of Indiana University and Indiana
* University Research and Technology
* Corporation. All rights reserved.
* Copyright (c) 2004-2013 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
2015-06-24 06:59:57 +03:00
* Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
* University of Stuttgart. All rights reserved.
* Copyright (c) 2004-2005 The Regents of the University of California.
* All rights reserved.
Refs trac:3275. We ran into a case where the OMPI SVN trunk grew a new acceptable MCA parameter value, but this new value was not accepted on the v1.6 branch (hwloc_base_mem_bind_failure_action -- on the trunk it accepts the value "silent", but on the older v1.6 branch, it doesn't). If you set "hwloc_base_mem_bind_failure_action=silent" in the default MCA params file and then accidentally ran with the v1.6 branch, every OMPI executable (including ompi_info) just failed because hwloc_base_open() would say "hey, 'silent' is not a valid value for hwloc_base_mem_bind_failure_action!". Kaboom. The only problem is that it didn't give you any indication of where this value was being set. Quite maddening, from a user perspective. So we changed the ompi_info handles this case. If any framework open function return OMPI_ERR_BAD_PARAM (either because its base MCA params got a bad value or because one of its component register/open functions return OMPI_ERR_BAD_PARAM), ompi_info will stop, print out a warning that it received and error, and then dump out the parameters that it has received so far in the framework that had a problem. At a minimum, this will show the user the MCA param that had an error (it's usually the last one), and ''where it was set from'' (so that they can go fix it). We updated ompi_info to check for O???_ERR_BAD_PARAM from each from the framework opens. Also updated the doxygen docs in mca.h for this O???_BAD_PARAM behavior. And we noticed that mca.h had MCA_SUCCESS and MCA_ERR_??? codes. Why? I think we used them in exactly one place in the code base (mca_base_components_open.c). So we deleted those and just used the normal OPAL_* codes instead. While we were doing this, we also cleaned up a little memory management during ompi_info/orte-info/opal-info finalization. Valgrind still reports a truckload of memory still in use at ompi_info termination, but they mostly look to be components not freeing memory/resources properly (and outside the scope of this fix). This commit was SVN r27306. The following Trac tickets were found above: Ticket 3275 --> https://svn.open-mpi.org/trac/ompi/ticket/3275
2012-09-12 00:47:24 +04:00
* Copyright (c) 2008-2012 Cisco Systems, Inc. All rights reserved.
* Copyright (c) 2011-2015 Los Alamos National Security, LLC.
* All rights reserved.
* Copyright (c) 2014 Hochschule Esslingen. All rights reserved.
* $COPYRIGHT$
2015-06-24 06:59:57 +03:00
*
* Additional copyrights may follow
2015-06-24 06:59:57 +03:00
*
- massive change for module<-->component name fixes throughout the code base. - many (most) mca type names have "component" or "module" in them, as relevant, just to further distinguish the difference between component data/actions and module data/actions. All developers are encouraged to perpetuate this convention when you create types that are specific to a framework, component, or module - did very little to entire framework (just the basics to make it compile) because it's just about to be almost entirely replaced - ditto for io / romio - did not work on elan or ib components; have to commit and then convert those on a different machine with the right libraries and headers - renamed a bunch of *_module.c files to *_component.c and *module*c to *component*c (a few still remain, e.g., ptl/ib, ptl/elan, etc.) - modified autogen/configure/build process to match new filenames (e.g., output static-components.h instead of static-modules.h) - removed DOS-style cr/lf stuff in ns/ns.h - added newline to end of file src/util/numtostr.h - removed some redundant error checking in the top-level topo functions - added a few {} here and there where people "forgot" to put them in for 1 line blocks ;-) - removed a bunch of MPI_* types from mca header files (replaced with corresponding ompi_* types) - all the ptl components had version numbers in their structs; removed - converted a few more elements in the MCA base to use the OBJ interface -- removed some old manual reference counting kruft This commit was SVN r1830.
2004-08-02 04:24:22 +04:00
* $HEADER$
*/
#include "opal_config.h"
- massive change for module<-->component name fixes throughout the code base. - many (most) mca type names have "component" or "module" in them, as relevant, just to further distinguish the difference between component data/actions and module data/actions. All developers are encouraged to perpetuate this convention when you create types that are specific to a framework, component, or module - did very little to entire framework (just the basics to make it compile) because it's just about to be almost entirely replaced - ditto for io / romio - did not work on elan or ib components; have to commit and then convert those on a different machine with the right libraries and headers - renamed a bunch of *_module.c files to *_component.c and *module*c to *component*c (a few still remain, e.g., ptl/ib, ptl/elan, etc.) - modified autogen/configure/build process to match new filenames (e.g., output static-components.h instead of static-modules.h) - removed DOS-style cr/lf stuff in ns/ns.h - added newline to end of file src/util/numtostr.h - removed some redundant error checking in the top-level topo functions - added a few {} here and there where people "forgot" to put them in for 1 line blocks ;-) - removed a bunch of MPI_* types from mca header files (replaced with corresponding ompi_* types) - all the ptl components had version numbers in their structs; removed - converted a few more elements in the MCA base to use the OBJ interface -- removed some old manual reference counting kruft This commit was SVN r1830.
2004-08-02 04:24:22 +04:00
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include "opal/class/opal_list.h"
#include "opal/util/argv.h"
#include "opal/util/output.h"
#include "opal/mca/mca.h"
#include "opal/mca/base/base.h"
#include "opal/constants.h"
#if OPAL_ENABLE_FT_CR == 1
#include "opal/runtime/opal_params.h"
#endif
- massive change for module<-->component name fixes throughout the code base. - many (most) mca type names have "component" or "module" in them, as relevant, just to further distinguish the difference between component data/actions and module data/actions. All developers are encouraged to perpetuate this convention when you create types that are specific to a framework, component, or module - did very little to entire framework (just the basics to make it compile) because it's just about to be almost entirely replaced - ditto for io / romio - did not work on elan or ib components; have to commit and then convert those on a different machine with the right libraries and headers - renamed a bunch of *_module.c files to *_component.c and *module*c to *component*c (a few still remain, e.g., ptl/ib, ptl/elan, etc.) - modified autogen/configure/build process to match new filenames (e.g., output static-components.h instead of static-modules.h) - removed DOS-style cr/lf stuff in ns/ns.h - added newline to end of file src/util/numtostr.h - removed some redundant error checking in the top-level topo functions - added a few {} here and there where people "forgot" to put them in for 1 line blocks ;-) - removed a bunch of MPI_* types from mca header files (replaced with corresponding ompi_* types) - all the ptl components had version numbers in their structs; removed - converted a few more elements in the MCA base to use the OBJ interface -- removed some old manual reference counting kruft This commit was SVN r1830.
2004-08-02 04:24:22 +04:00
/*
* Local functions
*/
static int open_components(mca_base_framework_t *framework);
struct mca_base_dummy_framework_list_item_t {
opal_list_item_t super;
mca_base_framework_t framework;
};
- massive change for module<-->component name fixes throughout the code base. - many (most) mca type names have "component" or "module" in them, as relevant, just to further distinguish the difference between component data/actions and module data/actions. All developers are encouraged to perpetuate this convention when you create types that are specific to a framework, component, or module - did very little to entire framework (just the basics to make it compile) because it's just about to be almost entirely replaced - ditto for io / romio - did not work on elan or ib components; have to commit and then convert those on a different machine with the right libraries and headers - renamed a bunch of *_module.c files to *_component.c and *module*c to *component*c (a few still remain, e.g., ptl/ib, ptl/elan, etc.) - modified autogen/configure/build process to match new filenames (e.g., output static-components.h instead of static-modules.h) - removed DOS-style cr/lf stuff in ns/ns.h - added newline to end of file src/util/numtostr.h - removed some redundant error checking in the top-level topo functions - added a few {} here and there where people "forgot" to put them in for 1 line blocks ;-) - removed a bunch of MPI_* types from mca header files (replaced with corresponding ompi_* types) - all the ptl components had version numbers in their structs; removed - converted a few more elements in the MCA base to use the OBJ interface -- removed some old manual reference counting kruft This commit was SVN r1830.
2004-08-02 04:24:22 +04:00
/**
* Function for finding and opening either all MCA components, or the
* one that was specifically requested via a MCA parameter.
*/
int mca_base_framework_components_open (mca_base_framework_t *framework,
mca_base_open_flag_t flags)
- massive change for module<-->component name fixes throughout the code base. - many (most) mca type names have "component" or "module" in them, as relevant, just to further distinguish the difference between component data/actions and module data/actions. All developers are encouraged to perpetuate this convention when you create types that are specific to a framework, component, or module - did very little to entire framework (just the basics to make it compile) because it's just about to be almost entirely replaced - ditto for io / romio - did not work on elan or ib components; have to commit and then convert those on a different machine with the right libraries and headers - renamed a bunch of *_module.c files to *_component.c and *module*c to *component*c (a few still remain, e.g., ptl/ib, ptl/elan, etc.) - modified autogen/configure/build process to match new filenames (e.g., output static-components.h instead of static-modules.h) - removed DOS-style cr/lf stuff in ns/ns.h - added newline to end of file src/util/numtostr.h - removed some redundant error checking in the top-level topo functions - added a few {} here and there where people "forgot" to put them in for 1 line blocks ;-) - removed a bunch of MPI_* types from mca header files (replaced with corresponding ompi_* types) - all the ptl components had version numbers in their structs; removed - converted a few more elements in the MCA base to use the OBJ interface -- removed some old manual reference counting kruft This commit was SVN r1830.
2004-08-02 04:24:22 +04:00
{
/* Open flags are not used at this time. Suppress compiler warning. */
if (flags & MCA_BASE_OPEN_FIND_COMPONENTS) {
bool open_dso_components = !(flags & MCA_BASE_OPEN_STATIC_ONLY);
/* Find and load requested components */
int ret = mca_base_component_find(NULL, framework, false, open_dso_components);
if (OPAL_SUCCESS != ret) {
return ret;
}
}
- massive change for module<-->component name fixes throughout the code base. - many (most) mca type names have "component" or "module" in them, as relevant, just to further distinguish the difference between component data/actions and module data/actions. All developers are encouraged to perpetuate this convention when you create types that are specific to a framework, component, or module - did very little to entire framework (just the basics to make it compile) because it's just about to be almost entirely replaced - ditto for io / romio - did not work on elan or ib components; have to commit and then convert those on a different machine with the right libraries and headers - renamed a bunch of *_module.c files to *_component.c and *module*c to *component*c (a few still remain, e.g., ptl/ib, ptl/elan, etc.) - modified autogen/configure/build process to match new filenames (e.g., output static-components.h instead of static-modules.h) - removed DOS-style cr/lf stuff in ns/ns.h - added newline to end of file src/util/numtostr.h - removed some redundant error checking in the top-level topo functions - added a few {} here and there where people "forgot" to put them in for 1 line blocks ;-) - removed a bunch of MPI_* types from mca header files (replaced with corresponding ompi_* types) - all the ptl components had version numbers in their structs; removed - converted a few more elements in the MCA base to use the OBJ interface -- removed some old manual reference counting kruft This commit was SVN r1830.
2004-08-02 04:24:22 +04:00
/* Open all registered components */
return open_components (framework);
- massive change for module<-->component name fixes throughout the code base. - many (most) mca type names have "component" or "module" in them, as relevant, just to further distinguish the difference between component data/actions and module data/actions. All developers are encouraged to perpetuate this convention when you create types that are specific to a framework, component, or module - did very little to entire framework (just the basics to make it compile) because it's just about to be almost entirely replaced - ditto for io / romio - did not work on elan or ib components; have to commit and then convert those on a different machine with the right libraries and headers - renamed a bunch of *_module.c files to *_component.c and *module*c to *component*c (a few still remain, e.g., ptl/ib, ptl/elan, etc.) - modified autogen/configure/build process to match new filenames (e.g., output static-components.h instead of static-modules.h) - removed DOS-style cr/lf stuff in ns/ns.h - added newline to end of file src/util/numtostr.h - removed some redundant error checking in the top-level topo functions - added a few {} here and there where people "forgot" to put them in for 1 line blocks ;-) - removed a bunch of MPI_* types from mca header files (replaced with corresponding ompi_* types) - all the ptl components had version numbers in their structs; removed - converted a few more elements in the MCA base to use the OBJ interface -- removed some old manual reference counting kruft This commit was SVN r1830.
2004-08-02 04:24:22 +04:00
}
/*
* Traverse the entire list of found components (a list of
* mca_base_component_t instances). If the requested_component_names
* array is empty, or the name of each component in the list of found
* components is in the requested_components_array, try to open it.
* If it opens, add it to the components_available list.
*/
static int open_components(mca_base_framework_t *framework)
{
opal_list_t *components = &framework->framework_components;
uint32_t open_only_flags = MCA_BASE_METADATA_PARAM_NONE;
int output_id = framework->framework_output;
mca_base_component_list_item_t *cli, *next;
int ret;
/*
* Pre-process the list with parameter constraints
* e.g., If requested to select only CR enabled components
* then only make available those components.
*
* JJH Note: Currently checkpoint/restart is the only user of this
* functionality. If other component constraint options are
* added, then this logic can be used for all contraint
* options.
*
* NTH: Logic moved to mca_base_components_filter.
*/
#if (OPAL_ENABLE_FT == 1) && (OPAL_ENABLE_FT_CR == 1)
if (opal_base_distill_checkpoint_ready) {
open_only_flags |= MCA_BASE_METADATA_PARAM_CHECKPOINT;
}
#endif /* (OPAL_ENABLE_FT == 1) && (OPAL_ENABLE_FT_CR == 1) */
2015-06-24 06:59:57 +03:00
/* If mca_base_framework_register_components was called with the MCA_BASE_COMPONENTS_ALL flag
we need to trim down and close any extra components we do not want open */
ret = mca_base_components_filter (framework, open_only_flags);
if (OPAL_SUCCESS != ret) {
return ret;
}
/* Announce */
opal_output_verbose (MCA_BASE_VERBOSE_COMPONENT, output_id, "mca: base: components_open: opening %s components",
framework->framework_name);
2015-06-24 06:59:57 +03:00
/* Traverse the list of components */
OPAL_LIST_FOREACH_SAFE(cli, next, components, mca_base_component_list_item_t) {
const mca_base_component_t *component = cli->cli_component;
2015-06-24 06:59:57 +03:00
opal_output_verbose (MCA_BASE_VERBOSE_COMPONENT, output_id,
"mca: base: components_open: found loaded component %s",
component->mca_component_name);
if (NULL != component->mca_open_component) {
/* Call open if register didn't call it already */
ret = component->mca_open_component();
Refs trac:3275. We ran into a case where the OMPI SVN trunk grew a new acceptable MCA parameter value, but this new value was not accepted on the v1.6 branch (hwloc_base_mem_bind_failure_action -- on the trunk it accepts the value "silent", but on the older v1.6 branch, it doesn't). If you set "hwloc_base_mem_bind_failure_action=silent" in the default MCA params file and then accidentally ran with the v1.6 branch, every OMPI executable (including ompi_info) just failed because hwloc_base_open() would say "hey, 'silent' is not a valid value for hwloc_base_mem_bind_failure_action!". Kaboom. The only problem is that it didn't give you any indication of where this value was being set. Quite maddening, from a user perspective. So we changed the ompi_info handles this case. If any framework open function return OMPI_ERR_BAD_PARAM (either because its base MCA params got a bad value or because one of its component register/open functions return OMPI_ERR_BAD_PARAM), ompi_info will stop, print out a warning that it received and error, and then dump out the parameters that it has received so far in the framework that had a problem. At a minimum, this will show the user the MCA param that had an error (it's usually the last one), and ''where it was set from'' (so that they can go fix it). We updated ompi_info to check for O???_ERR_BAD_PARAM from each from the framework opens. Also updated the doxygen docs in mca.h for this O???_BAD_PARAM behavior. And we noticed that mca.h had MCA_SUCCESS and MCA_ERR_??? codes. Why? I think we used them in exactly one place in the code base (mca_base_components_open.c). So we deleted those and just used the normal OPAL_* codes instead. While we were doing this, we also cleaned up a little memory management during ompi_info/orte-info/opal-info finalization. Valgrind still reports a truckload of memory still in use at ompi_info termination, but they mostly look to be components not freeing memory/resources properly (and outside the scope of this fix). This commit was SVN r27306. The following Trac tickets were found above: Ticket 3275 --> https://svn.open-mpi.org/trac/ompi/ticket/3275
2012-09-12 00:47:24 +04:00
if (OPAL_SUCCESS == ret) {
opal_output_verbose (MCA_BASE_VERBOSE_COMPONENT, output_id,
"mca: base: components_open: "
"component %s open function successful",
component->mca_component_name);
} else {
if (OPAL_ERR_NOT_AVAILABLE != ret) {
/* If the component returns OPAL_ERR_NOT_AVAILABLE,
it's a cue to "silently ignore me" -- it's not a
failure, it's just a way for the component to say
2015-06-24 06:59:57 +03:00
"nope!".
Otherwise, however, display an error. We may end
up displaying this twice, but it may go to separate
streams. So better to be redundant than to not
display the error in the stream where it was
expected. */
2015-06-24 06:59:57 +03:00
if (mca_base_component_show_load_errors) {
opal_output_verbose (MCA_BASE_VERBOSE_ERROR, output_id,
"mca: base: components_open: component %s "
"/ %s open function failed",
component->mca_type_name,
component->mca_component_name);
}
opal_output_verbose (MCA_BASE_VERBOSE_COMPONENT, output_id,
"mca: base: components_open: "
"component %s open function failed",
component->mca_component_name);
}
mca_base_component_close (component, output_id);
opal_list_remove_item (components, &cli->super);
OBJ_RELEASE(cli);
}
}
- massive change for module<-->component name fixes throughout the code base. - many (most) mca type names have "component" or "module" in them, as relevant, just to further distinguish the difference between component data/actions and module data/actions. All developers are encouraged to perpetuate this convention when you create types that are specific to a framework, component, or module - did very little to entire framework (just the basics to make it compile) because it's just about to be almost entirely replaced - ditto for io / romio - did not work on elan or ib components; have to commit and then convert those on a different machine with the right libraries and headers - renamed a bunch of *_module.c files to *_component.c and *module*c to *component*c (a few still remain, e.g., ptl/ib, ptl/elan, etc.) - modified autogen/configure/build process to match new filenames (e.g., output static-components.h instead of static-modules.h) - removed DOS-style cr/lf stuff in ns/ns.h - added newline to end of file src/util/numtostr.h - removed some redundant error checking in the top-level topo functions - added a few {} here and there where people "forgot" to put them in for 1 line blocks ;-) - removed a bunch of MPI_* types from mca header files (replaced with corresponding ompi_* types) - all the ptl components had version numbers in their structs; removed - converted a few more elements in the MCA base to use the OBJ interface -- removed some old manual reference counting kruft This commit was SVN r1830.
2004-08-02 04:24:22 +04:00
}
2015-06-24 06:59:57 +03:00
/* All done */
2015-06-24 06:59:57 +03:00
return OPAL_SUCCESS;
- massive change for module<-->component name fixes throughout the code base. - many (most) mca type names have "component" or "module" in them, as relevant, just to further distinguish the difference between component data/actions and module data/actions. All developers are encouraged to perpetuate this convention when you create types that are specific to a framework, component, or module - did very little to entire framework (just the basics to make it compile) because it's just about to be almost entirely replaced - ditto for io / romio - did not work on elan or ib components; have to commit and then convert those on a different machine with the right libraries and headers - renamed a bunch of *_module.c files to *_component.c and *module*c to *component*c (a few still remain, e.g., ptl/ib, ptl/elan, etc.) - modified autogen/configure/build process to match new filenames (e.g., output static-components.h instead of static-modules.h) - removed DOS-style cr/lf stuff in ns/ns.h - added newline to end of file src/util/numtostr.h - removed some redundant error checking in the top-level topo functions - added a few {} here and there where people "forgot" to put them in for 1 line blocks ;-) - removed a bunch of MPI_* types from mca header files (replaced with corresponding ompi_* types) - all the ptl components had version numbers in their structs; removed - converted a few more elements in the MCA base to use the OBJ interface -- removed some old manual reference counting kruft This commit was SVN r1830.
2004-08-02 04:24:22 +04:00
}