If we ended up with no modules (e.g., all usnic devices were
excluded), there was a race condition in that the connectivity agent
could tear down its local socket before one or more of the local
clients saw it.  Therefore, the local clients would time out waiting
for the socket to appear.
So move the connectivity checker init later in the bootstrapping
process (it *must* be set up before module_init()), and have it only
invoked if we actually ended up with one or more modules.
Re-structure the loop looking for duplicates a little so that we only
have a single free of the string that happens regardless of whether we
found a duplicate or not.
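
The restructured loop can be sketched roughly as follows.  This is a
minimal illustration, not the actual code: count_unique() and the
temporary copy it makes are hypothetical stand-ins for whatever string
the real loop allocates.  The point is only that the copy is released at
one place, whether or not a duplicate was found.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: count the distinct strings in names[0..n-1].
 * Returns -1 if a temporary copy cannot be allocated. */
int count_unique(const char *const *names, size_t n)
{
    int unique = 0;

    for (size_t i = 0; i < n; ++i) {
        /* Temporary working copy (stands in for the allocated string
         * in the real loop). */
        char *copy = malloc(strlen(names[i]) + 1);
        if (NULL == copy) {
            return -1;
        }
        strcpy(copy, names[i]);

        /* Look for an earlier entry with the same contents. */
        bool dup = false;
        for (size_t j = 0; j < i && !dup; ++j) {
            dup = (0 == strcmp(names[j], copy));
        }
        if (!dup) {
            ++unique;
        }

        /* Single free, reached on both the duplicate and the
         * non-duplicate path. */
        free(copy);
    }
    return unique;
}
```

Having one free at the bottom of the loop body avoids the leak-on-one-branch
pattern that static analyzers tend to flag.
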
This was Coverity CID 1288090.
Fix previously-unfinished error paths during startup/bootstrapping.
Instead of just blindly continuing on when an fi_* function call
fails, opal_show_help and skip that device.
Also, only check the usnic config minimums once. They're VIC-wide and
won't change on a per-device basis -- we only need to check them once.
Fixes CSCut19179.
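
A rough sketch of the resulting control flow, under stated assumptions:
sketch_device_t, sketch_check_minimums() and sketch_init_devices() are
hypothetical names, the open_ok flag stands in for the result of the
fi_* setup calls, and fprintf() stands in for opal_show_help().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical device record; open_ok simulates whether the setup
 * calls for this device succeeded. */
typedef struct {
    const char *name;
    bool open_ok;
} sketch_device_t;

/* Counts how often the (hypothetical) minimums check actually ran. */
static int sketch_minimum_checks = 0;

static bool sketch_check_minimums(void)
{
    ++sketch_minimum_checks;
    return true; /* pretend the VIC-wide minimums are satisfied */
}

/* Returns the number of usable devices. */
int sketch_init_devices(const sketch_device_t *devs, size_t n)
{
    bool minimums_checked = false;
    int usable = 0;

    for (size_t i = 0; i < n; ++i) {
        if (!devs[i].open_ok) {
            /* Report the failure and skip this device instead of
             * blindly continuing with it. */
            fprintf(stderr, "skipping %s: setup call failed\n",
                    devs[i].name);
            continue;
        }
        if (!minimums_checked) {
            /* The minimums do not vary per device: check them once. */
            if (!sketch_check_minimums()) {
                return 0;
            }
            minimums_checked = true;
        }
        ++usable;
    }
    return usable;
}
```
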
Embedding libltdl without the use of Libtool bootstrapping has
proven... difficult. Instead, create a new simple "dl" framework. It
only provides 4 functions:
- open a DSO (very similar to lt_dlopenadvise())
- lookup a symbol in a previously-opened DSO (very similar to lt_dlsym())
- close a previously-opened DSO (very similar to lt_dlclose())
- iterate over all files in a directory (very similar to lt_dlforeachfile())
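
For illustration only, the four operations could look roughly like this
when backed directly by POSIX dlopen() and opendir().  All sketch_dl_*
names and signatures are hypothetical; they are not the framework's
actual interface, and error reporting is reduced to 0 / -1.

```c
#include <dirent.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical handle type wrapping the value returned by dlopen(). */
typedef struct {
    void *handle;
} sketch_dl_handle_t;

/* Open a DSO.  A NULL filename opens the running program itself. */
int sketch_dl_open(const char *filename, sketch_dl_handle_t *out)
{
    out->handle = dlopen(filename, RTLD_LAZY | RTLD_LOCAL);
    return (NULL == out->handle) ? -1 : 0;
}

/* Look up a symbol in a previously-opened DSO. */
int sketch_dl_lookup(const sketch_dl_handle_t *h, const char *symbol,
                     void **ptr)
{
    dlerror(); /* clear any stale error state */
    *ptr = dlsym(h->handle, symbol);
    return (NULL != dlerror()) ? -1 : 0;
}

/* Close a previously-opened DSO. */
int sketch_dl_close(sketch_dl_handle_t *h)
{
    int rc = dlclose(h->handle);
    h->handle = NULL;
    return (0 == rc) ? 0 : -1;
}

/* Call fn(name, data) for every entry in dir; stop early if fn
 * returns nonzero.  Returns -1 if the directory cannot be opened. */
int sketch_dl_foreachfile(const char *dir,
                          int (*fn)(const char *name, void *data),
                          void *data)
{
    DIR *dp = opendir(dir);
    if (NULL == dp) {
        return -1;
    }
    int rc = 0;
    struct dirent *de;
    while (0 == rc && NULL != (de = readdir(dp))) {
        rc = fn(de->d_name, data);
    }
    closedir(dp);
    return rc;
}

/* Example callback: count directory entries. */
int sketch_count_entry(const char *name, void *data)
{
    (void) name;
    ++*(int *) data;
    return 0;
}
```

On older glibc versions this needs to be linked with -ldl.
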
There will be follow-on commits with a simple dlopen-based component
(nowhere near as complete/functional as libltdl, but good enough for
Linux and OS X), and a libltdl-based component for all other
platforms.
The intent is that the dlopen-based component can be built by default
in almost all cases. But if libltdl is available, that component will
be built. End result: we still get DSO-based functionality by default
in (almost?) all cases. Without embedding libltdl. Which is what we
want.
Call the opal_common_verbs_mca_register function to make sure that the
opal_common_verbs_want_fork_support MCA parameter is created and therefore
can be used to control the fork support.
Both opal_shmem_base_select() and
opal_shmem_base_best_runnable_component_name() were calling
opal_shmem_base_runtime_query(), which would do component selection
(and closing of losing components) twice.
Put protection in opal_shmem_base_runtime_query() to return the cached
results the second time. Additionally, make
opal_shmem_base_runtime_query() "own" the cached results
(vs. opal_shmem_base_select).
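
The caching idea can be sketched as below.  This is a simplified
stand-in, not the real function: sketch_component_t and
sketch_runtime_query() are hypothetical, and "selection" is reduced to
picking the highest priority.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical component record. */
typedef struct {
    const char *name;
    int priority;
} sketch_component_t;

/* Counts how many times real selection was performed. */
static int sketch_query_runs = 0;

/* Select the best component once; later calls return the cached
 * result without repeating selection.  The cache is owned by this
 * function rather than by its callers. */
const sketch_component_t *
sketch_runtime_query(const sketch_component_t *comps, size_t n)
{
    static bool selected = false;
    static const sketch_component_t *best = NULL;

    if (selected) {
        return best; /* second and later calls: cached result */
    }

    ++sketch_query_runs;
    for (size_t i = 0; i < n; ++i) {
        if (NULL == best || comps[i].priority > best->priority) {
            best = &comps[i];
        }
    }
    selected = true;
    return best;
}
```

With this shape, both callers can invoke the query freely; only the first
call does any work, so losing components are closed only once.
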
This allows the opal_shmem_base_module_t to be properly cast to an
mca_base_module_t.
(this commit is the rationale for the previous shmem C99 .member
initialization commit)