
337 Commits

Author SHA1 Message Date
Nathan Hjelm
ef01e130aa mca/base: protect mca_base_component_repository_release if dlopen support is disabled 2015-04-15 10:06:43 -06:00
Nathan Hjelm
e794658f2d Merge pull request #516 from hjelmn/repository_update
RFC: Repository update
2015-04-15 10:03:08 -06:00
Elena
96bdf595c2 fix for the -am and -tune options issue introduced by PR 520 2015-04-15 15:51:49 +03:00
Nathan Hjelm
c954f457d9 mca/base: update the way dynamic components are handled
This commit is a rework of the component repository. The changes
included in this commit are:

 - Remove the component dependency code based off .ompi_info
   files. This code is legacy code dating back 10 years and is no
   longer used.

 - Move the plugin scanning code to the component repository. New
   calls have been added to add new scanning paths, query available
   components, and dlopen/load components.

 - Pass the framework down to mca_base_component_find/filter. Eventually
   the framework structure will be used to further validate components
   before they are used.

 - Add support to the MCA framework system to disable scanning for
   dlopened components on open (support already existed in
   register). This is really only relevant to installdirs as it has no
   register function and no DSO components.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-14 15:55:33 -06:00
Nathan Hjelm
a7b0c00ab6 fix memory leaks and valgrind errors
This commit fixes several valgrind errors. Included:

 - installdirs did not correctly reinitialize all pointers to NULL
   at close. This causes valgrind errors on a subsequent call to
   opal_init_tool.

 - several opal strings were leaked by opal_deregister_params which
   was setting them to NULL instead of letting them be freed by the
   MCA variable system.

 - move opal_net_init to AFTER the variable system is initialized and
   opal's MCA variables have been registered. opal_net_init uses a
   variable registered by opal_register_params!

 - do not leak ompi_mpi_main_thread when it is allocated by
   MPI_T_init_thread.

 - do not overwrite ompi_mpi_main_thread if it is already set (by
   MPI_T_init_thread).

 - mca_base_var: read_files was overwriting mca_base_var_file_list
   even if it was non-NULL.

 - mca_base_var: set all file global variables to initial states on
   finalize.

 - btl/vader: decrement enumerator reference count to ensure that it
   is freed.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-11 09:28:35 -06:00
Nathan Hjelm
9cd955badf opal: fix multiple bugs in MCA and opal
This commit fixes the following bugs:

 - opal_output_finalize did not properly set internal state. This
   caused problems when calling the sequence opal_output_init (),
   opal_output_finalize (), opal_output_init ().

 - opal_info support called mca_base_open () but never called the
   matching mca_base_close (). mca_base_open () and mca_base_close ()
   have been updated to use an open count instead of an open flag to
   allow mca_base_open to be called through multiple paths (as may be
   the case when MPI_T is in use).

 - orte_info support did not register opal variables. This can cause
   orte-info to not return opal variables.

 - opal_info, orte_info, and ompi_info support have been updated to
   use a register count.

 - When opening the dl framework the reference count was added to
   ensure the framework stuck around. The framework being closed
   prematurely was a bug in the MCA base that has since been
   corrected. The increment (and associated decrement) have been
   removed.

 - dl/dlopen did not set the value of
   mca_dl_dlopen_component.filename_suffixes_mca_storage on each call
   to register. Instead the value was set in the component
   structure. This caused the value to be lost when re-loading the
   component. Fixed by setting the default value in register.

 - Reset shmem framework state on close to avoid returning a stale
   component after reloading opal/shmem.

 - MCA base parameters were not properly deregistered when the MCA
   base was closed.

This commit may fix #374.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-07 19:13:20 -06:00
Mike Dubman
58d002098b Merge pull request #474 from elenash/master
Introduce -tune command line option to set env vars and mca params from ...
2015-04-01 08:23:34 +03:00
Nathan Hjelm
6d1a41611f MCA/base: Detect overlapping variable names.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-03-27 10:59:05 -06:00
Nathan Hjelm
de1e7d58e1 Add support for setting variables from project_name environment variables
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-03-27 10:59:05 -06:00
Nathan Hjelm
b68d66bb9b MCA: Add the project/project version to the MCA base component
This commit adds support for project_framework_component_* parameter
matching. This is the first step in allowing the same framework name
in multiple projects. This change also bumps the MCA component version
to 2.1.0.

All master frameworks have been updated to use the new component
versioning macro. An mca.h has been added to each project to add a
project specific versioning macro of the form
PROJECT_MCA_VERSION_2_1_0.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-03-27 10:59:04 -06:00
Elena
90f5b2bb84 Introduce -tune command line option to set env vars and mca params from file 2015-03-26 18:33:53 +02:00
Nathan Hjelm
ccba8ce856 Merge pull request #457 from hjelmn/mpit_fixes
mca/base: fix bugs in framework deregistration/re-registration
2015-03-18 08:37:49 -06:00
Nathan Hjelm
fd78491768 Merge pull request #451 from elenash/master
fix: mca_base_env_var mca parameter is never handled if it's set from am...
2015-03-11 09:54:25 -06:00
Nathan Hjelm
005c6022e2 mca/base: fix bugs in framework deregistration/re-registration
There were a number of bugs in the framework/variable code that
affected deregistration:

 - Frameworks could be erroneously closed if separately registered and
   opened then subsequently closed. This was a bug in the original
   design which only reference counted opens but not
   registrations. This would cause undefined behavior if
   MPI_T_finalize actually calls ompi_info_close_components as
   intended. Now both registrations and opens are reference counted
   and frameworks/components are not torn down until the matching
   number of close calls have been made.

 - group_find_by_name did not pass the invalidok flags down
   to mca_base_var_group_get_internal correctly.

 - Group deregistration caused the group to be completely reset. This
   does not match the behavior required by MPI_T as it could reduce
   the number of variables/subgroups in a group.

This commit also updates MPI_T_finalize to call
ompi_info_close_components as originally intended.

Closes #374

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-03-09 16:52:53 -06:00
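
The reference counting described in 005c6022e2 (and the open count mentioned in 9cd955badf) can be illustrated with a minimal sketch. The struct and the `sketch_*` functions below are hypothetical stand-ins, not the real `mca_base_framework_t` API; the sketch only assumes that registrations and opens are counted separately and that teardown waits until both counts reach zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a framework; not the real mca_base_framework_t. */
typedef struct {
    int  register_count;  /* outstanding register calls */
    int  open_count;      /* outstanding open calls */
    bool live;            /* true while the framework's resources exist */
} sketch_framework_t;

/* Each register call takes one registration reference. */
void sketch_register(sketch_framework_t *fw)
{
    fw->register_count++;
    fw->live = true;
}

/* Each open call takes one open reference. */
void sketch_open(sketch_framework_t *fw)
{
    fw->open_count++;
    fw->live = true;
}

/*
 * Release one reference of the given kind. The framework is torn down
 * only when no registrations and no opens remain. Returns true if this
 * call performed the teardown.
 */
bool sketch_close(sketch_framework_t *fw, bool release_open)
{
    int *count = release_open ? &fw->open_count : &fw->register_count;

    assert(*count > 0);  /* closing more often than opening is a bug */
    --*count;

    if (0 == fw->register_count && 0 == fw->open_count) {
        fw->live = false;
        return true;
    }
    return false;
}
```

With a single open flag, the first close would tear the framework down even though a separate registration still needs it; with two counters, the teardown happens only on the last matching close.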
Jeff Squyres
a9d86129c6 mca base: convert to opal_dl interface 2015-03-09 08:16:55 -07:00
Elena
737f06dd68 fix: mca_base_env_var mca parameter is never handled if it's set from amca conf file 2015-03-06 12:12:26 +02:00
George Bosilca
75479c0f17 Fix some typos. 2015-03-05 12:59:58 -05:00
Gilles Gouaillardet
852dbafd51 mca/base: fix misc memory leaks
as reported by Coverity with CIDs 710628, 1196713 and 1269855
2015-03-05 14:06:18 +09:00
Jeff Squyres
3e8f468709 mca_base_framework: use the right type for dequeued list items
The items on the list are (mca_base_component_list_item_t*)'s, not
(mca_base_component_t*)'s.
2015-02-23 08:30:58 -08:00
Nathan Hjelm
0e09b9298a mca/base: add framework flag indicating a framework does not have
dso components

This flag is needed for a special case framework: dl. The framework is
needed before any dl components can be used.
2015-02-18 14:03:51 -07:00
Jeff Squyres
6d3a84514f mca_base_cmd_line.c: fix minor memory leak
This was CID 1269874.
2015-02-12 13:41:29 -08:00
Jeff Squyres
f8e334357d mca_base_pvar.c: protect removal from list
Only remove it from the list if it is actually on the list.

This was CID 1269758.
2015-02-12 13:41:29 -08:00
Nathan Hjelm
49ba150972 mca/base: fix path string parsing
CID 993709
2015-02-12 13:03:46 -07:00
Jeff Squyres
00c878957c mca_base_var.c: add debug check for another programming error
Coverity alerted us to the fact that there are places where
the synonym_for param is hard-coded to -1 when calling
register_variable().  It would be a coding error if synonym_for==-1
and (flags & MCA_BASE_VAR_FLAG_SYNONYM)>0, so let's add that to the
debug-only check at the top of the function.

This was CID 993717.
2015-02-12 10:24:02 -08:00
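
A rough sketch of the kind of debug-only check described in 00c878957c follows. The flag value and the `sketch_*` names are assumptions made for illustration; the real `MCA_BASE_VAR_FLAG_SYNONYM` and `register_variable()` live in the Open MPI sources and may look different.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag value; the real MCA_BASE_VAR_FLAG_SYNONYM is defined by Open MPI. */
#define SKETCH_VAR_FLAG_SYNONYM 0x0001u

/* Returns true when the (synonym_for, flags) pair is consistent. */
bool sketch_synonym_args_ok(int synonym_for, unsigned int flags)
{
    bool is_synonym = (flags & SKETCH_VAR_FLAG_SYNONYM) != 0;

    /* A variable flagged as a synonym must name the variable it aliases. */
    return !(is_synonym && -1 == synonym_for);
}

/* Hypothetical registration entry point showing where the check would sit. */
int sketch_register_variable(int synonym_for, unsigned int flags)
{
    /* Compiled out when NDEBUG is defined, so it only costs debug builds. */
    assert(sketch_synonym_args_ok(synonym_for, flags));

    /* ... the actual registration work would go here ... */
    return 0;
}
```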
Jeff Squyres
f39b294afe mca base: fix trivial typos in help message 2014-11-12 08:40:17 -08:00
Ralph Castain
894acb0aa8 configury: new OPAL_SET_MCA_PREFIX/ORTE_SET_MCA_CMD_LINE_ID macros
These two macros set the MCA prefix and MCA cmd line id,
   respectively.  Specifically, MCA parameters will be named
   PREFIX<foo> in the environment, and the cmd line will use
   -ID foo bar.

   These macros must be called during configure.ac and a value
   supplied. In the case of Open MPI, the values given are
   PREFIX=OMPI_MCA_ and ID=mca.

   Other projects (such as ORCM) will call these macros with
   their own unique values.  For example, ORCM uses PREFIX=ORCM_MCA_
   and ID=omca

   This scheme is necessary to allow running Open MPI applications under
   systems that use their own versions of ORTE and OPAL.  For example,
   when running OMPI applications under ORCM, we need the MCA params passed
   to the ORCM daemons to be separated from those recognized by the OMPI application.
2014-10-22 18:57:40 -07:00
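
The naming scheme from 894acb0aa8 (environment names of the form PREFIX<foo>, command-line options of the form -ID foo bar) can be sketched as below. The helper names are hypothetical and are not part of Open MPI; only the prefix and ID values come from the commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build PREFIX<param> in out. Returns 0 on success, -1 if out is too small. */
int sketch_mca_env_name(const char *prefix, const char *param,
                        char *out, size_t out_len)
{
    assert(NULL != prefix && NULL != param && NULL != out);

    if (strlen(prefix) + strlen(param) + 1 > out_len) {
        return -1;
    }
    snprintf(out, out_len, "%s%s", prefix, param);
    return 0;
}

/* Build "-ID param value" in out. Returns 0 on success, -1 if out is too small. */
int sketch_mca_cmd_line(const char *id, const char *param, const char *value,
                        char *out, size_t out_len)
{
    assert(NULL != id && NULL != param && NULL != value && NULL != out);

    /* '-' + id + ' ' + param + ' ' + value + terminating NUL */
    if (strlen(id) + strlen(param) + strlen(value) + 4 > out_len) {
        return -1;
    }
    snprintf(out, out_len, "-%s %s %s", id, param, value);
    return 0;
}
```

Under this sketch, a project configured with PREFIX=OMPI_MCA_ and ID=mca and one configured with PREFIX=ORCM_MCA_ and ID=omca produce disjoint environment names and option spellings for the same parameter, which is the separation the commit message asks for.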
Ralph Castain
b6aa691e0a Fix incorrect implementation of new MCA param mca_base_env_list - it was not picking up envars and forwarding them, but only worked if you explicitly set a value for the envar. Ensure it works for both direct and indirect launch modes. Remove stale code as this replaced orte_forward_envars. Ensure it doesn't get passed to the ORTE daemons. 2014-10-16 12:58:56 -07:00
Jeff Squyres
dc66e197cc var: fix segv in deprecated file var show_help()
Ensure to include the new variable filename in the show_help() output
when we load a deprecated MCA param from a file.

Fixes #236
2014-10-15 08:07:31 -07:00
Ralph Castain
fad4384463 Not sure how we could get to this point without having already detected the error, but just to be safe - check for end-of-array and return if error.
Refs trac:4897

This commit was SVN r32731.

The following Trac tickets were found above:
  Ticket 4897 --> https://svn.open-mpi.org/trac/ompi/ticket/4897
2014-09-13 02:23:30 +00:00
Ralph Castain
0445052a1c Check for multiple declarations of a given MCA param and error out if detected as that can create an ambiguous definition of the param value.
Refs trac:4897

This commit was SVN r32719.

The following Trac tickets were found above:
  Ticket 4897 --> https://svn.open-mpi.org/trac/ompi/ticket/4897
2014-09-12 22:21:30 +00:00
Jeff Squyres
d244b7b860 mca_base_var: fix possibility of unaligned variable assignments
Add a debugging check that ensures that the registered storage is
aligned appropriately for the type that is specified.

When we know that the storage is properly aligned, we can cast the
mbv_storage to the appropriate type and then simply do the assignment.
We used to do this assignment via a union, but clang's
-fsanitize=alignment complained about this.

This commit was SVN r32716.
2014-09-11 23:02:49 +00:00
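
The approach in d244b7b860 (check alignment in debug builds, then assign through a typed pointer) might look roughly like this. The function names are hypothetical and the sketch assumes a C11 compiler for `_Alignof`; it is not the code from mca_base_var.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if storage is a multiple of alignment (alignment must be nonzero). */
bool sketch_is_aligned(const void *storage, size_t alignment)
{
    return 0 != alignment && 0 == ((uintptr_t) storage % alignment);
}

/* Store an int through untyped storage once its alignment has been checked. */
void sketch_set_int(void *storage, int value)
{
    /* Debug-only check: misaligned storage is a registration bug. */
    assert(sketch_is_aligned(storage, _Alignof(int)));

    /* Safe to cast and assign directly because the alignment is known. */
    *(int *) storage = value;
}
```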
Ralph Castain
4df1aa63f7 Since we've run into the situation where someone puts a script wrapper around a launcher such as srun, we need to always protect MCA cmd line params with quotes. This means we also need to protect the backend from quotes coming into the system as part of a value, or else the parser gets confused.
So add a new function for wrapping MCA arguments, and tell the backend parser to ignore/remove leading/trailing quotes.

cmr=v1.8.3:reviewer=jsquyres

This commit was SVN r32686.
2014-09-08 20:38:46 +00:00
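
The two halves of 4df1aa63f7 (quote values on the way out, strip quotes on the way in) can be sketched as follows. The function names are hypothetical, and treating both single and double quotes as strippable is an assumption; the commit message only says leading/trailing quotes are ignored or removed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Wrap value in double quotes. Returns 0 on success, -1 if out is too small. */
int sketch_wrap_arg(const char *value, char *out, size_t out_len)
{
    assert(NULL != value && NULL != out);

    /* two quote characters plus the terminating NUL */
    if (strlen(value) + 3 > out_len) {
        return -1;
    }
    snprintf(out, out_len, "\"%s\"", value);
    return 0;
}

/* Remove at most one leading and one trailing quote character, in place. */
void sketch_strip_quotes(char *value)
{
    size_t len = strlen(value);

    if (len > 0 && ('"' == value[0] || '\'' == value[0])) {
        /* Shift left by one; copying len bytes includes the NUL. */
        memmove(value, value + 1, len);
        --len;
    }
    if (len > 0 && ('"' == value[len - 1] || '\'' == value[len - 1])) {
        value[len - 1] = '\0';
    }
}
```

A value that passes through a wrapper script unquoted is left untouched by the strip step, so the backend can apply it unconditionally.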
Jeff Squyres
a896f90712 btl_base_select: fix faulty/incorrect show_help message
When no components were able to be found, btl_base_select() was
showing the wrong help message -- one that indicated that a specific
component could not be found.  And it left off a string argument, so
the end of the help message was garbage.

This commit creates a new help message for this case and updates the
show_help call to use the new message.

This commit was SVN r32572.
2014-08-22 01:53:38 +00:00
Jeff Squyres
ca0ccc5321 headers: remove trailing commas in enum lists
Per http://www.open-mpi.org/community/lists/devel/2014/08/15576.php,
trailing commas are not valid in enum lists in C++ until C++11.

cmr=v1.8.2:reviewer=rhc

This commit was SVN r32482.
2014-08-09 12:04:17 +00:00
George Bosilca
97de458cd7 Fix.
This commit was SVN r32382.
2014-07-31 21:54:07 +00:00
Nathan Hjelm
a32d93ec20 mca/base: make clang static analyzer happy
cmr=v1.8.3:reviewer=jsquyres

This commit was SVN r32360.
2014-07-30 17:45:28 +00:00
Nathan Hjelm
97fad1dd95 mca/base: ensure component version parameters get deregistered when the
component gets dlclosed

cmr=v1.8.2:reviewer=rhc

This commit was SVN r32359.
2014-07-30 17:45:23 +00:00
Ralph Castain
552c9ca5a0 George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-)
WHAT:    Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL

All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have expressed interest in having a more generic communication infrastructure, without all the OMPI layer dependencies.  This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP.  Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose.  UTK, with support from Sandia, developed a version of Open MPI where the entire communication infrastructure has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to complete this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs.  A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic.

This commit was SVN r32317.
2014-07-26 00:47:28 +00:00
Ralph Castain
354a1d07a6 Silence warning about set-but-unused var
This commit was SVN r32186.
2014-07-09 22:37:05 +00:00
Mike Dubman
6efc9a7329 opal mca: externalize delimiter for base_env_list mca parameters
fixed by Elena, reviewed by MikeD

This commit was SVN r32182.
2014-07-09 18:55:49 +00:00
Joshua Ladd
801e2cb544 Fix error and warning messages after reverting
the mca_base_env_list to being semicolon delimited.

This commit was SVN r32179.
2014-07-09 14:46:19 +00:00
Mike Dubman
32aeba4bdf opal: revert env_list separator to ";" from "+"
This commit was SVN r32172.
2014-07-09 05:38:51 +00:00
Joshua Ladd
057370364d Opal: Add a new MCA variable type "version_string". Also add a
new flag to ompi_info that allows a user to print all MCA variables of a specific type.  

 --type version_string

This command will print all MCA variables of type version_string.

This feature was developed by Elena Shipunova and was reviewed by Josh Ladd.

This commit was SVN r32166.
2014-07-09 01:37:23 +00:00
Joshua Ladd
30da6d3a17 Opal: add a new MCA parameter that allows the user to specify a list of environment variables. This parameter will become the standard mechanism by which environment variables are set for OMPI applications replacing the -x option.
mpirun ... -x env_foo1=val1 -x env_foo2 -x env_foo3=val3  should now be expressed as

mpirun ... -mca mca_base_env_list env_foo1=val1+env_foo2+env_foo3=val3. 

The motivation for doing this is so that a list of environment variables may be set via standard MCA mechanisms such as mca parameter files, amca lists, etc. 

This feature was developed by Elena Shipunova and was reviewed by Josh Ladd.

This commit was SVN r32163.
2014-07-09 00:38:25 +00:00
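
A minimal sketch of how a list such as env_foo1=val1+env_foo2+env_foo3=val3 from 30da6d3a17 could be split into entries is shown below. The types, buffer sizes, and function name are hypothetical, and the delimiter is a parameter because later commits in this history switch it between "+" and ";". An entry without "=" is flagged so that a caller can look its value up in the current environment, mirroring `-x env_foo2`.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical parsed entry; sizes are arbitrary limits for the sketch. */
typedef struct {
    char name[64];
    char value[128];
    int  has_value;  /* 0: no "=" given, value should come from the environment */
} sketch_env_entry_t;

/* Split list on delim. Returns the number of entries, or -1 on overflow. */
int sketch_parse_env_list(const char *list, char delim,
                          sketch_env_entry_t *out, int max_entries)
{
    const char *start = list;
    int count = 0;

    assert(NULL != list && NULL != out);

    while ('\0' != *start) {
        const char *end = strchr(start, delim);
        size_t len = (NULL != end) ? (size_t) (end - start) : strlen(start);

        if (len > 0) {  /* skip empty items such as "a++b" */
            const char *eq = memchr(start, '=', len);
            size_t name_len = (NULL != eq) ? (size_t) (eq - start) : len;
            size_t value_len = (NULL != eq) ? len - name_len - 1 : 0;

            if (count >= max_entries || name_len >= sizeof(out->name) ||
                value_len >= sizeof(out->value)) {
                return -1;
            }
            memcpy(out[count].name, start, name_len);
            out[count].name[name_len] = '\0';
            if (NULL != eq) {
                memcpy(out[count].value, eq + 1, value_len);
            }
            out[count].value[value_len] = '\0';
            out[count].has_value = (NULL != eq);
            ++count;
        }
        if (NULL == end) {
            break;
        }
        start = end + 1;
    }
    return count;
}
```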
Jeff Squyres
86f747c627 mca_base_var: use ERR_NOT_FOUND when var is inactive
After discussion with Nathan, change the ERR_VALUE_OUT_OF_BOUNDS to be
ERR_NOT_FOUND, for two reasons:

1. It's consistent with other uses of ERR_NOT_FOUND in the code.
2. In this case, we could have just looked up a variable that is
   basically a "hole" -- e.g., var indexes 0, 1, 2 are valid, and 4,
   5, and 6 are valid, but 3 is invalid (e.g., 3 was de-registered --
   remember that MPI_T explicitly does not allow re-using indexes).
   So returning ERR_VALUE_OUT_OF_BOUNDS seems weird -- returning
   ERR_NOT_FOUND seems a bit more natural.

cmr=v1.8.2:ticket=trac:4587

This commit was SVN r32158.

The following Trac tickets were found above:
  Ticket 4587 --> https://svn.open-mpi.org/trac/ompi/ticket/4587
2014-07-08 20:00:18 +00:00
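
The "hole" case discussed in 86f747c627 can be sketched with a small lookup table. Everything here is hypothetical (the error constants, the struct, and the function are not the real mca_base_var code), and what the real code returns for an index past the end of the table is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical error codes standing in for the OPAL ones. */
enum {
    SKETCH_SUCCESS                 = 0,
    SKETCH_ERR_VALUE_OUT_OF_BOUNDS = -1,
    SKETCH_ERR_NOT_FOUND           = -2
};

/* Hypothetical variable slot; valid is false once a variable is de-registered. */
typedef struct {
    const char *name;
    bool        valid;
} sketch_var_t;

/* Look up a variable by index, distinguishing holes from bad indexes. */
int sketch_var_get(const sketch_var_t *vars, int count, int index,
                   const sketch_var_t **out)
{
    assert(NULL != vars && NULL != out);

    if (index < 0 || index >= count) {
        /* Assumption: an index outside the table is a bounds error. */
        return SKETCH_ERR_VALUE_OUT_OF_BOUNDS;
    }
    if (!vars[index].valid) {
        /* The slot exists but is a hole: indexes are never reused. */
        return SKETCH_ERR_NOT_FOUND;
    }
    *out = &vars[index];
    return SKETCH_SUCCESS;
}
```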
Ralph Castain
20535bca19 Reorder the var release so a debugger can still see the var name that caused a segfault, thus helping to identify the var in question
cmr=v1.8.2:reviewer=hjelmn

This commit was SVN r32068.
2014-06-24 13:51:31 +00:00
Ralph Castain
1a53d541ab Cleanup memory leak
cmr=v1.8.2:reviewer=hjelmn

This commit was SVN r32051.
2014-06-19 18:56:57 +00:00
Ralph Castain
5602156a1c Use the correct abstraction layer name for the data dirs
This commit was SVN r31684.
2014-05-08 14:32:24 +00:00
Ralph Castain
11faab1091 The final step of the RFC: convert the <foo>libdir and friends to fit their respective code areas, and equate them all at the top. Note that we can't entirely separate things as the opal_install_dirs framework can't handle separated locations for the various trees.
This commit was SVN r31679.
2014-05-08 02:01:35 +00:00
Nathan Hjelm
a28012b29d Fix MPI_T issues identified by friendly users.
Several fixes:

 - I was allowing an MPI_T_cvar_handle to be created for an invalid
   variable. Fixed this by checking if the variable is valid in
   mca_base_var_get.

 - Use a better error code when the caller tries to create an unbound
   pvar handle for a bound variable.

 - Return the verbosity level in MPI_T_cvar_get_info.

cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r31576.
2014-04-30 22:10:30 +00:00