This fixes a mismatch where the PS listing returned the
USERNAME but the code pruned entries based on UID.
It changes the OPAL_PS_FLAVOR_CHECK format to return
'uid' instead of 'user'. (Note: this avoids a call to
getlogin_r() but assumes the UID is uniform across the
system; the same assumption already exists for the session
dir.)
Note that this still maintains the behavior documented in the
man page for root running orte-clean on a node (kills all orteds).
Adds 'orte-dvm' to the list of procnames that will be checked/killed.
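The pruning rule described above can be sketched roughly as follows. This is an illustration only: `demo_should_consider` is a hypothetical helper, and the assumption that the numeric uid is the first column of each ps line is ours, not something the commit states.

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: decide whether a ps output line belongs to a
 * process that the caller may clean up.
 * ASSUMPTION: the line starts with a numeric uid column. */
static bool demo_should_consider(const char *ps_line, uid_t my_uid)
{
    unsigned long line_uid;

    /* Skip lines that do not start with a numeric uid. */
    if (NULL == ps_line || 1 != sscanf(ps_line, "%lu", &line_uid)) {
        return false;
    }
    /* Root keeps the documented behavior: every listed process qualifies. */
    if (0 == my_uid) {
        return true;
    }
    /* Everyone else only sees their own processes. */
    return (uid_t)line_uid == my_uid;
}
```

A caller would pass the value of getuid() as `my_uid`, so the comparison is uid-to-uid and no username lookup is needed.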
Signed-off-by: Thomas Naughton <naughtont@ornl.gov>
Add more clarifying statements about our definition of "backwards
compatibility" -- adding an example with static linking and another
with containers.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
As we changed the ABI (forcing a major release), we can limit
the size of the predefined communicators by moving the collective
structure outside the communicator. This might have a small,
likely unnoticeable, impact on performance. This approach was
discussed at the January 2017 devel meeting.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
 * Include a 'demo' component that shows some of the features.
 * Currently has hooks for:
   - MPI_Initialized
     - top, bottom
   - MPI_Init_thread
     - top, bottom
   - MPI_Finalized
     - top, bottom
   - MPI_Init
     - top (pre-opal_init), top (post-opal_init), error, bottom
   - MPI_Finalize
     - top, bottom
 * Other places in ompi can 'register' to hook into any one of these places
   by passing back a component structure filled with function pointers.
 * Add a `MCA_BASE_COMPONENT_FLAG_REQUIRED` flag to the MCA structure that
   is checked by the `hook` framework. If a required, static component has
   been excluded then the `hook` framework will fail to initialize.
   - See note in `opal/mca/mca.h` as to why this is checked in the `hook`
     framework and not in `opal/mca/base/mca_base_component_find.c`
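The registration idea above (a component structure filled with function pointers) can be sketched as below. All names are hypothetical and only a few hook points are shown; this is not the actual `hook` framework API.

```c
#include <stddef.h>

/* Hypothetical component structure: one optional callback per hook point. */
typedef struct {
    void (*init_top)(void);
    void (*init_bottom)(void);
    void (*finalize_top)(void);
    void (*finalize_bottom)(void);
} demo_hook_component_t;

#define DEMO_MAX_HOOKS 8

static const demo_hook_component_t *demo_hooks[DEMO_MAX_HOOKS];
static size_t demo_num_hooks = 0;

/* Register a component; returns 0 on success, -1 if it cannot be stored. */
static int demo_hook_register(const demo_hook_component_t *component)
{
    if (NULL == component || demo_num_hooks >= DEMO_MAX_HOOKS) {
        return -1;
    }
    demo_hooks[demo_num_hooks++] = component;
    return 0;
}

/* Invoke every registered "init top" callback, skipping unset pointers. */
static void demo_hook_call_init_top(void)
{
    for (size_t i = 0; i < demo_num_hooks; ++i) {
        if (NULL != demo_hooks[i]->init_top) {
            demo_hooks[i]->init_top();
        }
    }
}

/* Example callback used for illustration: counts how often it ran. */
static int demo_calls = 0;
static void demo_count_call(void)
{
    ++demo_calls;
}
```

Leaving a pointer NULL lets a component opt out of a hook point without needing a stub function.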
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
Under heavy load the locking code could fail if the underlying btl
module started to return OPAL_ERR_OUT_OF_RESOURCE on atomic
operations. This commit updates the code to gracefully handle btl
errors.
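The commit does not say which strategy it uses. One common way to tolerate a transient out-of-resource error is to retry after making progress, sketched here with hypothetical names and error codes (the `DEMO_*` constants are stand-ins, not the OPAL values).

```c
#include <stddef.h>

/* Hypothetical return codes standing in for the real error constants. */
enum { DEMO_SUCCESS = 0, DEMO_ERROR = -1, DEMO_ERR_OUT_OF_RESOURCE = -2 };

typedef int (*demo_atomic_fn_t)(void *ctx);
typedef void (*demo_progress_fn_t)(void *ctx);

/* Retry an operation while it reports a transient resource shortage.
 * Any other result (success or a hard error) is returned immediately. */
static int demo_atomic_retry(demo_atomic_fn_t op, demo_progress_fn_t progress,
                             void *ctx, int max_attempts)
{
    int rc = DEMO_ERR_OUT_OF_RESOURCE;

    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        rc = op(ctx);
        if (DEMO_ERR_OUT_OF_RESOURCE != rc) {
            return rc;
        }
        /* Give the transport a chance to free resources before retrying. */
        if (NULL != progress) {
            progress(ctx);
        }
    }
    return rc;
}

/* Stand-in for a busy transport: fails until the counter reaches zero. */
static int demo_flaky_op(void *ctx)
{
    int *remaining_failures = (int *)ctx;

    if (*remaining_failures > 0) {
        --*remaining_failures;
        return DEMO_ERR_OUT_OF_RESOURCE;
    }
    return DEMO_SUCCESS;
}
```

Bounding the number of attempts keeps a persistent shortage from turning into an endless loop; the caller still sees the error code once the attempts run out.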
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
In this context, AMD64 really means amd64 or em64t, so
rename it to X86_64 in order to avoid any confusion.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
There is no need to look for an assembly file when BUILTIN_GCC is used.
Fixes open-mpi/ompi#3032
Refs open-mpi/ompi#3036
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Add ability to pass DVM URI purely via environment
to simplify invocation from command-line (e.g., start dvm,
export URI, mpirun w/o needing to add `--hnp` arg).
If user passes both envvar *and* cmdline, the cmdline wins.
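The precedence rule (cmdline wins over envvar) can be sketched as follows. The environment variable name `DEMO_DVM_URI` and the helper are hypothetical; the commit message does not name the actual variable.

```c
#include <stdlib.h>

/* ASSUMPTION: placeholder name, not the variable used by the real code. */
#define DEMO_DVM_URI_ENV "DEMO_DVM_URI"

/* Pick the DVM URI: a non-empty command-line value wins, otherwise fall
 * back to the environment. Returns NULL if neither is set. */
static const char *demo_resolve_dvm_uri(const char *cmdline_uri)
{
    const char *env_uri;

    if (NULL != cmdline_uri && '\0' != *cmdline_uri) {
        return cmdline_uri;
    }
    env_uri = getenv(DEMO_DVM_URI_ENV);
    if (NULL != env_uri && '\0' != *env_uri) {
        return env_uri;
    }
    return NULL;
}
```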
Signed-off-by: Thomas Naughton <naughtont@ornl.gov>
Ensure that job errors do not cause the DVM to fail unless the failed job is the DVM itself.
Refs #2987, with improvements from Ralph
Signed-off-by: Thomas Naughton <naughtont@ornl.gov>
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Fix the comment on the opal_list_item_compare_fn_t typedef: it gave the return value for a < b as 11 instead of -1.
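For reference, a comparator following the -1 / 0 / 1 convention looks like this. The sketch uses the standard qsort() signature and a hypothetical `demo_item_t`, not the OPAL list types.

```c
#include <stdlib.h>

/* Hypothetical item type used only for this illustration. */
typedef struct {
    int key;
} demo_item_t;

/* Returns -1 if a < b, 0 if a == b, and 1 if a > b. */
static int demo_item_compare(const void *a, const void *b)
{
    const demo_item_t *ia = (const demo_item_t *)a;
    const demo_item_t *ib = (const demo_item_t *)b;

    if (ia->key < ib->key) {
        return -1;
    }
    if (ia->key > ib->key) {
        return 1;
    }
    return 0;
}

/* Sort an array of items in ascending key order. */
static void demo_sort_items(demo_item_t *items, size_t count)
{
    qsort(items, count, sizeof(demo_item_t), demo_item_compare);
}
```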
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>
This commit makes the vma tree garbage collection list a lifo. This
way we can avoid having to hold any lock when releasing vmas. In
theory this should finally fix the hold-and-wait deadlock detailed
in #1654.
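A rough sketch of why a LIFO helps: pushing onto it needs only an atomic compare-and-swap, so the releasing thread never takes a lock. This uses C11 atomics and hypothetical names; it is not the OPAL lifo implementation.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical node and list types for illustration only. */
typedef struct demo_node {
    struct demo_node *next;
    int id;
} demo_node_t;

typedef struct {
    _Atomic(demo_node_t *) head;
} demo_lifo_t;

/* Push without a lock: retry the compare-and-swap until the head
 * we linked against is still the current head. */
static void demo_lifo_push(demo_lifo_t *lifo, demo_node_t *node)
{
    demo_node_t *old_head = atomic_load(&lifo->head);

    do {
        node->next = old_head;
    } while (!atomic_compare_exchange_weak(&lifo->head, &old_head, node));
}

/* Detach the whole chain in one atomic step so a collector can walk
 * it privately. Returns the most recently pushed node, or NULL. */
static demo_node_t *demo_lifo_take_all(demo_lifo_t *lifo)
{
    return atomic_exchange(&lifo->head, (demo_node_t *)NULL);
}
```

Taking the whole chain at once, instead of popping single nodes, sidesteps the ABA problem that a lock-free pop would otherwise have to handle.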
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Need to check the entire value instead of just the last N digits. Otherwise, "-host 15" will match nid0015, nid0115, and any other launch id ending in 15.
It appears strtol can leave its end pointer either NULL or pointing at a zero-length string, so check for both cases.
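A minimal sketch of a whole-value comparison, assuming node names consist of a non-digit prefix followed by the numeric launch id (as in nid0015). `demo_launch_id_matches` is a hypothetical helper, not the function changed by this commit.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Parse a fully numeric string. Fails if nothing was consumed, if
 * trailing characters remain, or if the value is out of range. */
static bool demo_parse_whole(const char *text, long *value)
{
    char *end = NULL;

    errno = 0;
    *value = strtol(text, &end, 10);
    /* Accept only if the end pointer is set, moved past at least one
     * character, and now points at the terminating NUL. */
    return 0 == errno && NULL != end && end != text && '\0' == *end;
}

/* Does the host token (e.g. "15") name the same launch id as the
 * node name (e.g. "nid0015")? Compares complete values, not suffixes. */
static bool demo_launch_id_matches(const char *token, const char *node_name)
{
    long want, have;
    const char *digits;

    if (NULL == token || NULL == node_name) {
        return false;
    }
    if (!demo_parse_whole(token, &want)) {
        return false;
    }
    /* Skip the non-digit prefix of the node name. */
    digits = node_name;
    while ('\0' != *digits && !isdigit((unsigned char)*digits)) {
        ++digits;
    }
    if (!demo_parse_whole(digits, &have)) {
        return false;
    }
    return want == have;
}
```

Comparing the parsed numbers means leading zeros in the node name are ignored, while nid0115 is no longer mistaken for launch id 15.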
Signed-off-by: Ralph Castain <rhc@open-mpi.org>