PMIx_server_register_nspace() is an asynchronous operation, so
the pmix glue waits for it to complete before returning.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
the argc field from the opal_pmix_app_t struct was removed,
so adjust the pmix/ext11 glue accordingly.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
- New MCA option: opal_stacktrace_output
- Specifies where the stack trace output stream goes.
- Accepts: none, stdout, stderr, file[:filename]
- Default filename 'stacktrace'
- Filename will be `stacktrace.PID`, or if VPID is available,
then the filename will be `stacktrace.VPID.PID`
- Update util/stacktrace to allow for different output avenues
including files. Previously this was hardcoded to 'stderr'.
- Since opal_backtrace_print needs to be signal safe, passing it a
FILE object that actually represents a file stream is difficult. This
is because we cannot open the file in the signal handler using
`fopen` (not safe), but have to use `open` (safe). Additionally, we
cannot use `fdopen` to convert the `int fd` to a `FILE *fh` since it
is also not signal safe.
- I did not want to break the backtrace.h API so I introduced a new
rule (documented in `backtrace.c`) that if the `FILE *file`
argument is `NULL` then look for the `opal_stacktrace_output_fileno`
variable to tell you which file descriptor to use for output.
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
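The signal-safety constraint described in the stacktrace commit above can be illustrated with a small sketch. It assumes the output goes to a plain file descriptor (for example one obtained from `open`) and uses only `write`, which is async-signal-safe. The helper name `sigsafe_puts` is hypothetical and is not part of the Open MPI backtrace API.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write a NUL-terminated string to an already-open
 * file descriptor using only write(2), which is async-signal-safe.
 * Returns the number of bytes written, or -1 on error. */
ssize_t sigsafe_puts(int fd, const char *msg)
{
    size_t len = 0, done = 0;

    while (msg[len] != '\0') {   /* manual length count, no stdio involved */
        len++;
    }
    while (done < len) {
        ssize_t n = write(fd, msg + done, len - done);
        if (n < 0) {
            if (errno == EINTR) {
                continue;        /* interrupted before writing, retry */
            }
            return -1;
        }
        done += (size_t)n;       /* handle partial writes */
    }
    return (ssize_t)done;
}
```

A handler following the rule above would call something like this with the stored file descriptor whenever the `FILE *` argument is `NULL`.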
Since the oob and connections systems do not work the same way they
did in older versions of Open MPI these operations are no longer
necessary. At best they do nothing and at worst they hurt performance
by making us enter the event library more often in opal_progress().
Fixes #2839
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
When Java bindings are used, MPI_Init() is not invoked
by the main thread, which causes some keys to be destructed twice.
Reset the per-thread values to NULL in order to handle this correctly.
Fixes open-mpi/ompi#2811
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
* Similar to `orte_map_stddiag_to_stderr` except it redirects `stddiag`
to `stdout` instead of `stderr`.
* Add protection so that the user cannot supply both:
- `orte_map_stddiag_to_stderr`
- `orte_map_stddiag_to_stdout`
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
* Adds a parameter to adjust the method used by libevent.
- Matches that of the libevent2022 component.
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
Note: since the discovered cpus are filtered against this list, #slots will be set to the #cpus in the list if no slot values are given in a -host or -hostname specification.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
* The AC_LANG_PROGRAM macro adds the `main()` so it is erroneous
to add it to the test program.
* This was detected with the XL compilers which will fail to
build the program in this situation. The GNU compiler does not
error out or warn, but successfully compiles the program.
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
always check the permissions of the created directory,
in case someone else created the very same directory but
with incompatible permissions.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
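The "always check" approach from the directory commit above might look roughly like the following sketch. The helper name `ensure_dir` and its return convention are assumptions made for illustration; the actual Open MPI code is not shown in this log.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: create `path` if it does not exist, then verify
 * the result in every case, including when someone else created it first.
 * Returns 0 if `path` is a directory carrying all bits in `mode`, else -1. */
int ensure_dir(const char *path, mode_t mode)
{
    struct stat st;

    if (mkdir(path, mode) != 0 && errno != EEXIST) {
        return -1;               /* creation failed for a real reason */
    }
    if (stat(path, &st) != 0 || !S_ISDIR(st.st_mode)) {
        return -1;               /* missing, or exists but is not a directory */
    }
    /* every permission bit we asked for must actually be set */
    return ((st.st_mode & mode) == mode) ? 0 : -1;
}
```

Note that the process umask can clear requested bits at creation time, so callers of a helper like this would normally ask only for bits the umask leaves alone, such as `S_IRWXU`.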
Due to the conversion from ssize_t to int we were losing bytes, and
ended up writing outside the receiver buffer. Similarly on the send,
due to the conversion to a narrower type, we could misinterpret the
end of the fragment.
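As an illustration of the truncation problem described above, a guard such as the following could detect byte counts that do not survive a conversion to `int`. This is only a sketch: `count_fits_in_int` is a hypothetical helper, and the fix described in the commit is about keeping the wider type rather than adding such a check.

```c
#include <limits.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if a byte count returned by read()/write()
 * style calls can be stored in an int without losing bits, 0 otherwise. */
int count_fits_in_int(ssize_t n)
{
    return n >= INT_MIN && n <= INT_MAX;
}
```

On platforms where `ssize_t` is 64 bits wide and `int` is 32 bits wide, any count above `INT_MAX` fails this check, which is exactly the range where bytes were being lost.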
The linux timer code was multiplying the result of the x86 time stamp
counter by 1000000 before dividing by the cpu frequency. This can
cause us to overflow 64 bits if the time stamp counter grows larger
than ~1.8e13 (about 8400 seconds after boot). To fix the issue, the
units of opal_timer_linux_freq have been changed to MHz.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
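The overflow described in the timer commit above can be reproduced with plain 64-bit arithmetic. The function names below are hypothetical and only contrast the two orders of operation; they are not the actual opal timer functions.

```c
#include <stdint.h>

/* Old order of operations: scale to microseconds first, then divide by
 * the frequency in Hz. The multiplication wraps around once the counter
 * exceeds UINT64_MAX / 1000000, which is roughly 1.8e13. */
uint64_t usec_scale_then_divide(uint64_t tsc, uint64_t freq_hz)
{
    return tsc * 1000000ULL / freq_hz;
}

/* New order of operations: keep the frequency in MHz, so converting
 * cycles to microseconds is a single division with no overflow risk. */
uint64_t usec_divide_by_mhz(uint64_t tsc, uint64_t freq_mhz)
{
    return tsc / freq_mhz;
}
```

Both functions agree for small counter values. With a counter of 2e13 on an assumed 2.2 GHz clock, the first one wraps and returns a value far too small, while the second still returns the expected elapsed microseconds.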
the hwloc topology might not contain a NUMA object with hwloc < v2
if the node is not NUMA, so force the NUMA object count to one
in order to correctly allocate mca_btl_sm_component.sm_mpools.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Add a missing constraint to the input operand list.
This fixes a regression caused by d4be138a7b.
Thanks to Orion Poplawski for reporting the issue.
Refs #2610
Signed-off-by: Nysal Jan K.A <jnysal@in.ibm.com>
This commit fixes a deadlock that can occur when the libc version
holds a lock when calling munmap. In this case we could end up calling
free() from vma_tree_delete which would in turn try to obtain the lock
in libc. To avoid the issue, put any deleted vmas in a new list on the
vma module and release them on the next call to vma_tree_insert. This
should be safe as this function is not called from the memory hooks.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
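The deferred-release idea from the deadlock commit above can be sketched as follows. All type and function names here are hypothetical stand-ins, not the real vma module structures, and the sketch leaves out the locking a real implementation would need.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for a vma item and the module that owns it. */
typedef struct item {
    struct item *next;
} item_t;

typedef struct {
    item_t *deferred;   /* items removed from the tree but not yet freed */
} module_t;

/* Called from the delete path, possibly inside a memory hook:
 * must not call free(), so the item is only parked on a list. */
void defer_release(module_t *m, item_t *it)
{
    it->next = m->deferred;
    m->deferred = it;
}

/* Called from a context where free() is safe, e.g. the next insert.
 * Returns the number of items released. */
int release_deferred(module_t *m)
{
    int count = 0;

    while (m->deferred != NULL) {
        item_t *it = m->deferred;
        m->deferred = it->next;
        free(it);
        count++;
    }
    return count;
}
```

The design choice is that the hook path never enters the allocator, so it cannot block on a lock that libc already holds.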
This should probably not go to the v2.x branch, since it changes the
output format of the usnic stats.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>