Take another shot at untangling the spaghetti
orterun: fix for command line parsing
orte-submit calls opal_init_util () before parsing out MCA command line
options (-mca, -am, etc). This prevents mpirun from setting opal MCA
variables for some frameworks as well as the MCA base. This is because
when a framework is opened all of its variables are set to read-only.
Eventually we want to lift this restriction on some MCA variables but
since -mca is affected we must parse out the MCA command line options
before opal_init_util(). This commit fixes the bug by adding a new
option to opal_cmd_line_parse (ignore unknown option) so orte-submit
can pre-parse the command line for MCA options.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
Minor cleanups to avoid releasing/recreating the cmd line
This commit does two things. It removes checks for C99 required
headers (stdlib.h, string.h, signal.h, etc). Additionally it removes
definitions for required C99 types (intptr_t, int64_t, int32_t, etc).
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
A few uninitialized common symbols are remaining (generated by flex) :
* orte/mca/rmaps/rank_file/rmaps_rank_file_lex.c: orte_rmaps_rank_file_leng
* orte/mca/rmaps/rank_file/rmaps_rank_file_lex.c: orte_rmaps_rank_file_text
* orte/util/hostfile/hostfile_lex.c: orte_util_hostfile_leng
* orte/util/hostfile/hostfile_lex.c: orte_util_hostfile_text
These two macros set the prefix for the OPAL and ORTE libraries,
respectively. Specifically, the OPAL library will be named
libPREFIXopen-pal.la and the ORTE library will be named
libPREFIXopen-rte.la.
These macros must be called, even if the prefix argument is empty.
The intent is that Open MPI will call these macros with an empty
prefix, but other projects (such as ORCM) will call these macros with
a non-empty prefix. For example, ORCM libraries can be named
liborcm-open-pal.la and liborcm-open-rte.la.
This scheme is necessary to allow running Open MPI applications under
systems that use their own versions of ORTE and OPAL. For example,
when running MPI applications under ORTE, if the ORTE and OPAL
libraries between OMPI and ORCM are not identical (which, because they
are released at different times, are likely to be different), we need
to ensure that the OMPI applications link against their ORTE and OPAL
libraries, but the ORCM executables link against their ORTE and OPAL
libraries.
the OPAL and ORTE libraries. This is required by projects such as ORCM
that have their own ORTE and OPAL libraries in order to avoid library
confusion. By renaming their version of the libraries, the OMPI
applications can correctly dynamically load the correct one for their
build."
This reverts commit 63f619f871.
http://www.open-mpi.org/community/lists/devel/2014/05/14822.php
Revamp the ORTE global data structures to reduce memory footprint and add new features. Add ability to control/set cpu frequency, though this can only be done if the sys admin has setup the system to support it (or you run as root).
This commit was SVN r31916.
Also, since I put some of the macros for these silent/verbose rules up
in the top-level Makefile.man-page-rules file, I renamed it to
Makefile.ompi-rules.
I've had this sitting around for a while; now seems like as good a
time as any to commit it.
This commit was SVN r31271.
pkg{data,lib,includedir}, use our own ompi{data,lib,includedir}, which is
always set to {datadir,libdir,includedir}/openmpi. This will keep us from
having help files in prefix/share/open-rte when building without Open MPI,
but in prefix/share/openmpi when building with Open MPI.
This commit was SVN r30140.
Features:
- Support for an override parameter file (openmpi-mca-param-override.conf).
Variable values in this file can not be overridden by any file or environment
value.
- Support for boolean, unsigned, and unsigned long long variables.
- Support for true/false values.
- Support for enumerations on integer variables.
- Support for MPIT scope, verbosity, and binding.
- Support for command line source.
- Support for setting variable source via the environment using
OMPI_MCA_SOURCE_<var name>=source (either command or file:filename)
- Cleaner API.
- Support for variable groups (equivalent to MPIT categories).
Notes:
- Variables must be created with a backing store (char **, int *, or bool *)
that must live at least as long as the variable.
- Creating a variable with the MCA_BASE_VAR_FLAG_SETTABLE enables the use of
mca_base_var_set_value() to change the value.
- String values are duplicated when the variable is registered. It is up to
the caller to free the original value if necessary. The new value will be
freed by the mca_base_var system and must not be freed by the user.
- Variables with constant scope may not be settable.
- Variable groups (and all associated variables) are deregistered when the
component is closed or the component repository item is freed. This
prevents a segmentation fault from accessing a variable after its component
is unloaded.
- After some discussion we decided we should remove the automatic registration
of component priority variables. Few component actually made use of this
feature.
- The enumerator interface was updated to be general enough to handle
future uses of the interface.
- The code to generate ompi_info output has been moved into the MCA variable
system. See mca_base_var_dump().
opal: update core and components to mca_base_var system
orte: update core and components to mca_base_var system
ompi: update core and components to mca_base_var system
This commit also modifies the rmaps framework. The following variables were
moved from ppr and lama: rmaps_base_pernode, rmaps_base_n_pernode,
rmaps_base_n_persocket. Both lama and ppr create synonyms for these variables.
This commit was SVN r28236.
It appears the problem was not with the command line parser but the rsh plm. I don't know why this problem was not occuring before the command line parser changes but it appears to be resolved now.
This commit was SVN r27527.
The following SVN revision numbers were found above:
r27451 --> open-mpi/ompi@d59034e6ef
r27456 --> open-mpi/ompi@ecdbf34937
Roll in the ORTE state machine. Remove last traces of opal_sos. Remove UTK epoch code.
Please see the various emails about the state machine change for details. I'll send something out later with more info on the new arch.
This commit was SVN r26242.
* fixed some bugs where "unknown" tokens were allowed on the command
line (which should really only be used for ortertun).
* if an unknown token is encountered, print a short error to stderr
and quit with a nonzero exit status
* if we don't find the right number of parameters to an option, print
a short error to stderr and quit with a nonzero exit status
* when --help is given, print the help message to stdout (not stderr)
and quit with a zero exit status
* added --showme:help option to the wrapper compilers
* updated docs in opal/util/cmd_line.h
* other small/miscellaneous CLI parsing bugs in various tools
I won't bore you with what we did before. :-) Here's some examples
of what the new behavior looks like:
{{{
% ompi_info --bogus
ompi_info: Error: unknown option "--bogus"
Type 'ompi_info --help' for usage.
% ompi_info --param bogus
ompi_info: Error: option "--param" did not have enough parameters (2)
Type 'ompi_info --help' for usage.
%
}}}
This commit was SVN r26072.
To enable the epochs and the resilient orte code, use the configure flag:
--enable-resilient-orte
This will define both:
ORTE_ENABLE_EPOCH
ORTE_RESIL_ORTE
This commit was SVN r25093.
It's not possible to combine two shared libraries on Windows, so we have to do it a bit different. First generate a small event static library by just linking the object files, and link it into other libraries that needs the libevent API.
This commit was SVN r24039.
(OMPI_ERR_* = OPAL_SOS_GET_ERR_CODE(ret)), since the return value could be a
SOS-encoded error. The OPAL_SOS_GET_ERR_CODE() takes in a SOS error and returns
back the native error code.
* Since OPAL_SUCCESS is preserved by SOS, also change all calls of the form
(OPAL_ERROR == ret) to (OPAL_SUCCESS != ret). We thus avoid having to
decode 'ret' to get the native error code.
This commit was SVN r23162.
Many of the OPAL_ENABLE_FT should be OPAL_ENABLE_FT_CR, so fix those.
The OPAL Layer INC should call opal_output on restart so that it can refresh the string it prints to reflect the current pid/hostname which may have changed.
This commit was SVN r22824.