This commit is related to an RFC from June 2014. Disscussion can be
found at:
http://www.open-mpi.org/community/lists/devel/2014/07/15140.php
The finalize function is set using either the linker option -fini or
__attribute__((destructor)) depending on compiler support. I have
confirmed that this hybrid approach works with all the major
compilers. The attribute is supported by gcc, clang, llvm, xlc, and
icc. The fini function will support pgi. If a compiler/linker
combination does not support either the destructor or fini function a
message will be printed on re-init indicating it is not supported (an
improvement over the current behavior-- SEGV).
I moved the following to the destructor function:
- Class system finalize. This solves a bug when MPI_T_finalize is
called before MPI_Init. The only downside to this change is we will
leave the footprint of the opal class system after
MPI_Finalize. This footprint should be relatively small.
This is an alternative to #517 but the two PRs are not
mutually-exclusive (with some modifications). This commit should also
be safe for 1.8.x as it does not change internal or external ABI (#517
changes internal ABI).
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit is a rework of the component repository. The changes
included in this commit are:
- Remove the component dependency code based off .ompi_info
files. This code is legacy code dating back 10 years that and is no
longer used.
- Move the plugin scanning code to the component repository. New
calls have been added to add new scanning paths, query available
components, and dlopen/load components.
- Pass the framework down to mca_base_component_find/filter. Eventually
the framework structure will be used to further validate components
before they are used.
- Add support to the MCA framework system to disable scanning for
dlopened components on open (support already existed in
register). This is really only relevant to installdirs as it has no
register function and no DSO components.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This adds a check at `make install` time to look for common symbols. It
attempts to ignore "Fortran-shaped" symbols by default. It also will
look in the source tree for any files named "common_sym_whitelist" and
will ignore any symbols listed in that file (one per line, comments
allowed).
See open-mpi/ompi#375 for more background.
This commit fixes a typo in mca_btl_vader_progress_endpoints where
OPAL_THREAD_LOCK was used when OPAL_THREAD_UNLOCK was intended.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
== Short version
Do not export special variables into the environment (e.g., LIBS,
LDFLAGS, etc.) when invoking subdir configure scripts. This prevents
problems described in open-mpi/ompi#471.
== More detail
Exporing special env variables before invoking a subdir configure
script causes problems in some cases. E.g., in open-mpi/ompi#471,
when the user configures with `--with-hwloc=/path/to/hwloc`, and that
directory is *not* in a default linker search location will cause the
libevent subdir configuration to fail.
This happens because:
1. We'll pass LIBS="-L/path/to/hwloc/lib -lhwloc" to the libevent
configure script
1. Meaning: configure-generated executables will link successfully
1. But unless LD_LIBRARY_PATH (or some other
tell-the-linker-where-to-find-things mechanism) includes
/path/to/hwloc/lib, the executable can't run.
Specifically, the libevent "hey, does the compiler generate proper
executables?" check will fail, and configure will abort (because OMPI
needs libevent).
I checked the history: exporting these vars dates all the way back to
LAM/MPI. I can't think of a reason why we need to export these
variables -- AC_CONFIG_SUBDIRs doesn't do it; subdir configure scripts
should be orthogonal from the upper-layer configure script (and its
variables). So let's remove these export statements and see if
anything breaks.
The oob/ud configure was not honoring the case
if the ompi is configured with --with-verbs=no.
This fixes that problems.
Fixes#522
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
OMPI *requires* a C99 compiler, so we don't need the AC_C_INLINE or
AC_C_RESTRICT tests anymore (because the C99 compiler guarantees to
support them). Indeed, in some cases (see open-mpi/ompi#491),
AC_C_INLINE gets the wrong answer and defines `inline` to be empty,
which screws up some systems (e.g., OS X with clang).
As an added bonus, we get to get rid of a bunch of gcc 2.96-specific
code (!) in configure.ac.
Based on some on-list and IM discussion with @hjelmn about
open-mpi/ompi@40b7643119, change the testing to a switch/case. If we
fall into the default case, assert() error (because it's an OMPI
developer programming error).
This commit fixes several vagrind errors. Included:
- installdirs did not correctly reinitialize all pointers to NULL
at close. This causes valgrind errors on a subsequent call to
opal_init_tool.
- several opal strings were leaked by opal_deregister_params which
was setting them to NULL instead of letting them be freed by the
MCA variable system.
- move opal_net_init to AFTER the variable system is initialized and
opal's MCA variables have been registered. opal_net_init uses a
variable registered by opal_register_params!
- do not leak ompi_mpi_main_thread when it is allocated by
MPI_T_init_thread.
- do not overwrite ompi_mpi_main_thread if it is already set (by
MPI_T_init_thread).
- mca_base_var: read_files was overwritting mca_base_var_file_list
even if it was non-NULL.
- mca_base_var: set all file global variables to initial states on
finalize.
- btl/vader: decrement enumerator reference count to ensure that it
is freed.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit fixes a bug when opal_error_init is called with the same
values multiple times. If opal_error_init is called too many times it
will start failing with OPAL_ERR_OUT_OF_RESOURCE. To fix the problem
check if an existing convertor matching the requested one and return
that one instead.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
putenv requires that any string put into the environment is not
changed or freed. That is not the case with constant strings as they
will go away when dlclose is called on the component. Instead, just
use opal_setenv which does not have this restriction.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>