1
1
Граф коммитов

683 Коммитов

Автор SHA1 Сообщение Дата
Ralph Castain
6216225bda Ensure cleanup of registered files/dirs
Resolve a race condition between registering for a file to be removed upon termination and actual creation of that file by providing attributes that identify whether the path is a file or directory. This removes the need for PMIx to detect the difference.

Refs #4686

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-11 11:05:30 -08:00
Ralph Castain
07427c6d89 Update to PMIx v3.0 PR for cleanup registration
If available, have apps use registration capability to cleanup their session directories. Setup capability for vader to register its shared memory file location - let someone familiar with that code do so.

Final cleanup to track uid/gid, update the opal/pmix API to pass flags for ignore and leave top directory alone

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-12-18 06:53:11 -08:00
Ralph Castain
3906aaf41a Silence warnings
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-11-25 11:50:18 -08:00
Wojtek Wasko
276de13a1e Make interface's kernel index an int instead of int16_t
Sometimes, the ethernet interfaces can get quite high kernel indices. struct
ifreq (see netdevice(7)) defines ifr_ifindex to be int's. The OOB component
used int16_t internally for matching (in case of -mca oob_tcp_if_[in|ex]clude)
which meant that any interface index > 32767 would never be matched because the
integer would be truncated to int16_t upon return from the function. OOB would
then refuse to work because it didn't find any usable interfaces and MPI job
would abort.

Signed-off-by: Wojtek Wasko <wwasko@nvidia.com>
2017-11-15 04:32:26 -05:00
George Bosilca
8f32b345de
Address syslog issues on OSX 10.13 with gcc 7.x
gcc 7.[1,2] (at least) fails to correctly parse the OSX 10.13 sys/syslog.h
header. As a results we need to potect syslog support in OPAL, PMIX and
ORTE.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-10-23 14:02:10 -04:00
Gilles Gouaillardet
b3558f261b opal/util: initialize proc_hostname in the opal_proc_t constructor
Refs open-mpi/ompi#4264

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-26 10:47:26 +09:00
Brian Barrett
502f383f4d util: Add link-local check to net interface
Add a check for link-local IPv6 addresses to the net
interface to support better computation of network
pairings in the weighted reachable component.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-09-19 19:42:54 -07:00
Brian Barrett
abbe2ffb9f util: Fix graph allocation size
Fix an allocation bug that could occur on non-LP64 platforms.
match_edges_out is an array of integers representing the
edges of the graph (where vertices are ints), with two ints
for every edge.  The previous code allocated enough space
for num_dges * sizeof(int*), which happens to be the same
as num_edges * 2 * sizeof(int) on LP64 platforms, but would
be wrong on all other platforms.

Fixes: CID 1417754

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-09-17 19:49:26 +00:00
Brian Barrett
bffcc3bca0 util: move graph solver from usnic to util
Cisco wrote a bipartite graph solver to properly solve
interface pair selection for usNIC.  Using the reachable
framework, the TCP BTL (and possibly the runtime network
code) can use the graph solver to make more optimal pair
selection.  Jeff was happy to have the code more broadly
used, but didn't have time to do the move, hence this
commit.

There are a couple of minor changes to the code compared
to the usNIC version.  Obviously, the functions have
been renamed to match naming convention for their new
home.  Since it's easier to write unit tests for
util/ code, the unit tests have been made first class
tests run at "make check" time.  This last bit required
moving some of the definitions into a new header,
bipartite_graph_internal.h, so that they could be
included in both the library code and the test code.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-09-15 15:08:47 -07:00
Josh Hursey
8688219091 Merge pull request #3775 from jjhursey/fix/mca_base_verbose-file
opal/mca: Fix mca_base_verbose file suffix processing
2017-07-18 10:14:42 -05:00
Jeff Squyres
ccf17808b6 Merge pull request #3258 from markalle/pr/symbol_name_pollution
symbol name pollution
2017-07-12 16:19:25 -05:00
Artem Polyakov
832f1b03a4 Merge pull request #3790 from artpol84/orte/iof_sbatch
orte/iof: Address the case when output is a regular file
2017-07-12 09:38:01 -05:00
Mark Allen
552216f9ba scripted symbol name change (ompi_ prefix)
Passed the below set of symbols into a script that added ompi_ to them all.

Note that if processing a symbol named "foo" the script turns
    foo  into  ompi_foo
but doesn't turn
    foobar  into  ompi_foobar

But beyond that the script is blind to C syntax, so it hits strings and
comments etc as well as vars/functions.

    coll_base_comm_get_reqs
    comm_allgather_pml
    comm_allreduce_pml
    comm_bcast_pml
    fcoll_base_coll_allgather_array
    fcoll_base_coll_allgatherv_array
    fcoll_base_coll_bcast_array
    fcoll_base_coll_gather_array
    fcoll_base_coll_gatherv_array
    fcoll_base_coll_scatterv_array
    fcoll_base_sort_iovec
    mpit_big_lock
    mpit_init_count
    mpit_lock
    mpit_unlock
    netpatterns_base_err
    netpatterns_base_verbose
    netpatterns_cleanup_narray_knomial_tree
    netpatterns_cleanup_recursive_doubling_tree_node
    netpatterns_cleanup_recursive_knomial_allgather_tree_node
    netpatterns_cleanup_recursive_knomial_tree_node
    netpatterns_init
    netpatterns_register_mca_params
    netpatterns_setup_multinomial_tree
    netpatterns_setup_narray_knomial_tree
    netpatterns_setup_narray_tree
    netpatterns_setup_narray_tree_contigous_ranks
    netpatterns_setup_recursive_doubling_n_tree_node
    netpatterns_setup_recursive_doubling_tree_node
    netpatterns_setup_recursive_knomial_allgather_tree_node
    netpatterns_setup_recursive_knomial_tree_node
    pml_v_output_close
    pml_v_output_open
    intercept_extra_state_t
    odls_base_default_wait_local_proc
    _event_debug_mode_on
    _evthread_cond_fns
    _evthread_id_fn
    _evthread_lock_debugging_enabled
    _evthread_lock_fns
    cmd_line_option_t
    cmd_line_param_t
    crs_base_self_checkpoint_fn
    crs_base_self_continue_fn
    crs_base_self_restart_fn
    event_enable_debug_output
    event_global_current_base_
    event_module_include
    eventops
    sync_wait_mt
    trigger_user_inc_callback
    var_type_names
    var_type_sizes

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2017-07-11 02:13:23 -04:00
Mark Allen
efc25168cd symbol name pollution: making some vars static
As part of addressing symbol name pollution, I'm switching a few
vars/functions to static.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2017-07-11 02:13:22 -04:00
Gilles Gouaillardet
ff2dd69533 opal/util: silence warning in opal_info_dup_mode()
as reported by coverity with CID 1414729

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-07-11 14:40:37 +09:00
Gilles Gouaillardet
85ff3ebad1 opal: fix return status of opal_info_set()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-07-11 13:58:15 +09:00
Gilles Gouaillardet
92441accc9 opal/info: fix recursive deadlock in opal_info_dup_mode()
use opal_info_{get,set}_nolock() instead of opal_info_{get,set}()
since the former can be invoked when the info lock is being held.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-07-10 14:51:46 +09:00
Artem Polyakov
d9ad918a14 orte/iof: Address the case when output is a regular file
Regular files are always write-ready, so non-blocking I/O does not
give any benefits for them.
More than that - if libevent is using "epoll" to track fd events,
epoll_ctl will refuse attempt to add an fd pointing to a regular
file descriptor with EPERM.
This fix checks the object referenced by fd and avoids event_add
using event_active instead.

In the original configuration that uncovered this issue "epoll"
was used in libevent, it was triggering the following warning
message:
"[warn] Epoll ADD(1) on fd 0 failed.  Old events were 0; read
change was 1 (add); write change was 0 (none): Operation not
permitted"
And the side effect was accumulation of all output in mpirun
memory and actually writing it only at mpirun exit.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2017-07-01 02:24:14 +07:00
Joshua Hursey
3b780ac137 opal/mca: Fix mca_base_verbose file suffix processing
* `-mca mca_base_verbose file:foo` should create an output file with
    the suffix `foo`. But since we free the pointer at the end of this
    function then by the time we use it it is pointing to invalid memory.
 * This commit fixes that corruption
 * This commit also fixes the behavior of `file:` with no suffix.
   Makes it the same as `file` without the colon.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-06-27 16:52:56 -05:00
Ralph Castain
ecacde0cd5 Purge whitespace errors
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-23 11:12:14 -07:00
Nathan Hjelm
9c621ad5a4 opal/info: fix abstraction break
The new info infrastructure introduced an abstration break by
including mpi.h and using MPI_ constants in opal. This commit fixes
the break by changing the constants to their opal equivalents.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-06-23 08:03:01 -06:00
Nathan Hjelm
ffd8ee2dfd opal: use opal_list_t convienience macros
This commit cleans up code in opal to use OPAL_LIST_FOREACH(_SAFE),
OPAL_LIST_DESTRUCT, and OPAL_LIST_RELEASE.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-06-20 12:37:12 -06:00
KAWASHIMA Takahiro
3afc61644d opal/util: Get rid of \0 from abort delay message
My recent commit 6b91edd had this bug.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-06-19 20:08:34 +09:00
KAWASHIMA Takahiro
6b91eddc8b Apply opal_abort_delay to the signal handler
This commit expands the effect of the MCA parameter `opal_abort_delay`
to the OPAL signal handler. This allows attaching of a debugger on
segmentation fault etc. before quitting the job.

The sleep code is moved to the `opal_delay_abort` function from the
`ompi_mpi_abort` and `oshmem_shmem_abort` functions for code cleanup.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-06-08 19:34:48 +09:00
Joshua Hursey
fce28c31d0 opal/stacktrace: Fix stderr target for opal_stacktrace_output
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-05-22 13:46:02 -05:00
Mark Allen
482d84b6e5 fixes for Dave's get/set info code
The expected sequence of events for processing info during object creation
is that if there's an incoming info arg, it is opal_info_dup()ed into the obj
at obj->s_info first. Then interested components register callbacks for
keys they want to know about using opal_infosubscribe_infosubscribe().

Inside info_subscribe_subscribe() the specified callback() is called with
whatever matching k/v is in the object's info, or with the default. The
return string from the callback goes into the new k/v stored in info, and
the input k/v is saved as __IN_<key>/<val>. It's saved the same way
whether the input came from info or whether it was a default. A null return
from the callback indicates an ignored key/val, and no k/v is stored for
it, but an __IN_<key>/<val> is still kept so we still have access to the
original.

At MPI_*_set_info() time, opal_infosubscribe_change_info() is used. That
function calls the registered callbacks for each item in the provided info.
If the callback returns non-null, the info is updated with that k/v, or if
the callback returns null, that key is deleted from info. An __IN_<key>/<val>
is saved either way, and overwrites any previously saved value.

When MPI_*_get_info() is called, opal_info_dup_mpistandard() is used, which
allows relatively easy changes in interpretation of the standard, by looking
at both the <key>/<val> and __IN_<key>/<val> in info. Right now it does
  1. includes system extras, eg k/v defaults not expliclty set by the user
  2. omits ignored keys
  3. shows input values, not callback modifications, eg not the internal values

Currently the callbacks are doing things like
    return some_condition ? "true" : "false"
that is, returning static strings that are not to be freed. If the return
strings start becoming more dynamic in the future I don't see how unallocated
strings could support that, so I'd propose a change for the future that
the callback()s registered with info_subscribe_subscribe() do a strdup on
their return, and we change the callers of callback() to free the strings
it returns (there are only two callers).

Rough outline of the smaller changes spread over the less central files:
  comm.c
    initialize comm->super.s_info to NULL
    copy into comm->super.s_info in comm creation calls that provide info
    OBJ_RELEASE comm->super.s_info at free time
  comm_init.c
    initialize comm->super.s_info to NULL
  file.c
    copy into file->super.s_info if file creation provides info
    OBJ_RELEASE file->super.s_info at free time
  win.c
    copy into win->super.s_info if win creation provides info
    OBJ_RELEASE win->super.s_info at free time

  comm_get_info.c
  file_get_info.c
  win_get_info.c
    change_info() if there's no info attached (shouldn't happen if callbacks
      are registered)
    copy the info for the user

The other category of change is generally addressing compiler warnings where
ompi_info_t and opal_info_t were being used a little too interchangably. An
ompi_info_t* contains an opal_info_t*, at &(ompi_info->super)

Also this commit updates the copyrights.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2017-05-17 01:12:49 -04:00
David Solt
50aa143ab6 Major structural changes to data types: .super infosubscriber
ompi_communicator_t, ompi_win_t, ompi_file_t all have a super class of type opal_infosubscriber_t instead of a base/super type of opal_object_t (in previous code comm used c_base, but file used super).  It may be a bit bold to say that being a subscriber of MPI_Info is the foundational piece that ties these three things together, but if you object, then I would prefer to turn infosubscriber into a more general name that encompasses other common features rather than create a different super class.  The key here is that we want to be able to pass comm, win and file objects as if they were opal_infosubscriber_t, so that one routine can heandle all 3 types of objects being passed to it.

MPI_INFO_NULL is still an ompi_predefined_info_t type since an MPI_Info is part of ompi but the internal details of the underlying information concept is part of opal.

An ompi_info_t type still exists for exposure to the user, but it is simply a wrapper for the opal object.

Routines such as ompi_info_dup, etc have all been moved to opal_info_dup and related to the opal directory.

Fortran to C translation tables are only used for MPI_Info that is exposed to the application and are therefore part of the ompi_info_t and not the opal_info_t

The data structure changes are primarily in the following files:

    communicator/communicator.h
    ompi/info/info.h
    ompi/win/win.h
    ompi/file/file.h

The following new files were created:

    opal/util/info.h
    opal/util/info.c
    opal/util/info_subscriber.h
    opal/util/info_subscriber.c

This infosubscriber concept is that communicators, files and windows can have subscribers that subscribe to any changes in the info associated with the comm/file/window.  When xxx_set_info is called, the new info is presented to each subscriber who can modify the info in any way they want.  The new value is presented to the next subscriber and so on until all subscribers have had a chance to modify the value.  Therefore, the order of subscribers can make a difference but we hope that there is generally only one subscriber that cares or modifies any given key/value pair.  The final info is then stored and returned by a call to xxx_get_info.

The new model can be seen in the following files:

    ompi/mpi/c/comm_get_info.c
    ompi/mpi/c/comm_set_info.c
    ompi/mpi/c/file_get_info.c
    ompi/mpi/c/file_set_info.c
    ompi/mpi/c/win_get_info.c
    ompi/mpi/c/win_set_info.c

The current subscribers where changed as follows:

    mca/io/ompio/io_ompio_file_open.c
    mca/io/ompio/io_ompio_module.c
    mca/osc/rmda/osc_rdma_component.c (This one actually subscribes to "no_locks")
    mca/osc/sm/osc_sm_component.c (This one actually subscribes to "blocking_fence" and "alloc_shared_contig")

Signed-off-by: Mark Allen <markalle@us.ibm.com>

Conflicts:
	AUTHORS
	ompi/communicator/comm.c
	ompi/debuggers/ompi_mpihandles_dll.c
	ompi/file/file.c
	ompi/file/file.h
	ompi/info/info.c
	ompi/mca/io/ompio/io_ompio.h
	ompi/mca/io/ompio/io_ompio_file_open.c
	ompi/mca/io/ompio/io_ompio_file_set_view.c
	ompi/mca/osc/pt2pt/osc_pt2pt.h
	ompi/mca/sharedfp/addproc/sharedfp_addproc.h
	ompi/mca/sharedfp/addproc/sharedfp_addproc_file_open.c
	ompi/mca/topo/treematch/topo_treematch_dist_graph_create.c
	ompi/mpi/c/lookup_name.c
	ompi/mpi/c/publish_name.c
	ompi/mpi/c/unpublish_name.c
	opal/mca/mpool/base/mpool_base_alloc.c
	opal/util/Makefile.am
2017-05-12 14:41:05 -04:00
Nathaniel Graham
01312b2f90 Additional mpirun --help changes
This commit recategorizes several mpirun arguments,
and moves the information for mpirun --help arguments
to the bottom of the general help message.  I also
added the OPAL_CMD_LINE_OTYPE field to two commands
that were missed initially because they were not
in the same area as the others.

Signed-off-by: Nathaniel Graham <ngraham@lanl.gov>
2017-04-19 11:43:45 -06:00
Ralph Castain
dadc924cde Cleanup warnings when timing is not enabled
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-04-11 17:29:27 -07:00
Artem Polyakov
4477b87e1d Merge pull request #3303 from karasevb/timing2/master
OMPI timings
2017-04-11 07:52:40 -07:00
Boris Karasev
d132eab4a5 ompi/timings: fixed the error of opal timings env import
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2017-04-11 12:08:48 +06:00
Ralph Castain
95ae0d1df3 Cleanup timing macros for portability across compilers. Rename the --enable-timing configure option to be --enable-pmix-timing so it doesn't pickup external timing requests. Remove a stale function reference in PMIx so it can compile with timing enabled.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-04-10 12:56:38 +06:00
Boris Karasev
36a0e71f2d ompi/timings: preparing to production state
Adds:
- enabling/disabling of timings throught environment variable `OMPI_TIMING_ENABLE`
- output format: [file name]:[function name]:[description]: avg/min/max
- dynamically extending array of results for case then inited size was exhausted
- catch and collect errors
- cleanup

Note:
For use feature need to configure with `--enable-timings`
and set env `OMPI_TIMING_ENABLE = 1`

Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2017-04-07 21:16:57 +06:00
Artem Polyakov
45898a9c65 opal/timing: add the draft of env-based timings
This commit adds new timing feature that uses environment variables to
expose timing information. This allows easy access to this data (if
timing is enabled) from any other part of the application for the subsequent
postprocessing.
In particular this will be integrated with OMPI-level timing framework that
whill use MPI_Reduce functionality to provide more compact and easy-to use
information.

This commit also adds the example of usage of this framework by annotating
rte_init function. The result is not used anywhere for now. It will be
postprocessed in subsequent commits.

NOTE: that functionality is currently disabled untill it will be verified at runtime

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2017-04-07 21:16:22 +06:00
Artem Polyakov
88ed79ea25 opal/timing: remove old framework
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2017-04-07 21:16:22 +06:00
Nathaniel Graham
36d660e07a Add parsable option to help arguments
This commit adds a "parsable" option to the help
arguments, which prints out a machine readable
list of all the mpirun options.

Fixes #3279

Signed-off-by: Nathaniel Graham <ngraham@lanl.gov>
2017-04-05 17:01:43 -06:00
Mark Allen
655a06f559 IB fork
The key change was in btl_openib_connect_udcm.c where a buffer was
being pinned with size 65664 (whether openib was being used or not).
The start of the buffer was page aligned, but because of the size
the end wasn't. That makes it too easy for a forked child to accidentally
touch pinned memory on the same page as the end of that buffer.

So this change increases the size of the allocated buffer to use the
rest of the page.

I inspected the rest of the ibv_reg_mr() calls and changed one other
place to page align its buffer too, although I think the above is
the one that really matters.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2017-04-05 17:35:52 -04:00
Nathaniel Graham
19e5d15491 mpirun --help output revamp
This commit modifies the output from the mpirun --help
command.  The options have been split into groups, to
make the output smaller and more readable.  The groups
are: general, debug, output, input, mapping, ranking,
binding, devel, compatibility, launch, dvm, and
unsupported. There is also a special "full" command
that can be used to get the old behaviour of printing
out all of the options.  Unsupported options may only
be seen with this full output.

This commit also adds a special case for the help
argument.  It makes it possible for the user to
enter 0 or 1 arguments instead of having to always
enter an argument.  This defaults to printing out
the "general" help options so the user can then
see what help arguments there are.

Signed-off-by: Nathaniel Graham <ngraham@lanl.gov>
2017-04-04 10:59:32 -06:00
George Bosilca
b0f8d2c460
Never free the statically allocated buffer.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-03-01 13:21:03 -05:00
Jeff Squyres
fec519a793 hwloc: rename opal/mca/hwloc/hwloc.h -> hwloc-internal.h
Per a prior commit, the presence of "hwloc.h" can cause ambiguity when
using --with-hwloc=external (i.e., whether to include
opal/mca/hwloc/hwloc.h or whether to include the system-installed
hwloc.h).

This commit:

1. Renames opal/mca/hwloc/hwloc.h to hwloc-internal.h.
2. Adds opal/mca/hwloc/autogen.options to tell autogen.pl to expect to
   find hwloc-internal.h (instead of hwloc.h) in opal/mca/hwloc.
3. s@opal/mca/hwloc/hwloc.h@opal/mca/hwloc/hwloc-internal.h@g in the
   rest of the code base.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-02-28 07:48:42 -08:00
Jeff Squyres
45b791542c Merge pull request #2809 from jjhursey/fix/ibm/opal-verbose
opal/output: Make sure verbose gets updated when id 0 gets updated.
2017-01-31 12:18:38 -05:00
Josh Hursey
2e64bf42fb Merge pull request #2810 from jjhursey/fix/ibm/stdiag-to-stdout
Extend options for stddiag routing
2017-01-26 14:29:16 -06:00
Jeff Squyres
2c277a66fd Merge pull request #2772 from jjhursey/topic/stacktrace-improv
master: opal/stacktrace improvements
2017-01-26 10:48:41 -08:00
Joshua Hursey
6d98559be9 stacktrace: Add flexibility in stacktrace ouptut
- New MCA option: opal_stacktrace_output
   - Specifies where the stack trace output stream goes.
   - Accepts: none, stdout, stderr, file[:filename]
   - Default filename 'stacktrace'
     - Filename will be `stacktrace.PID`, or if VPID is available,
       then the filename will be `stacktrace.VPID.PID`
 - Update util/stacktrace to allow for different output avenues
   including files. Previously this was hardcoded to 'stderr'.
 - Since opal_backtrace_print needs to be signal safe, passing it a
   FILE object that actually represents a file stream is difficult. This
   is because we cannot open the file in the signal handler using
   `fopen` (not safe), but have to use `open` (safe). Additionally, we
   cannot use `fdopen` to convert the `int fd` to a `FILE *fh` since it
   is also not signal safe.
   - I did not want to break the backtrace.h API so I introduced a new
     rule (documented in `backtrace.c`) that if the `FILE *file`
     argument is `NULL` then look for the `opal_stacktrace_output_fileno`
     variable to tell you which file descriptor to use for output.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-01-26 11:55:32 -06:00
Joshua Hursey
f8918e37a9 opal/stacktace: Raise the signal after processing
- This prevents us for accidentally masking a signal that was meant to
   terminate the application.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-01-26 11:55:28 -06:00
Joshua Hursey
dcd9801f7c orte/iof: Add orte_map_stddiag_to_stdout option
* Similar to `orte_map_stddiag_to_stderr` except it redirects `stddiag`
   to `stdout` instead of `stderr`.
 * Add protection so that the user canot supply both:
   - `orte_map_stddiag_to_stderr`
   - `orte_map_stddiag_to_stdout`

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-01-24 16:22:59 -06:00
Joshua Hursey
2596983593 opal/output: Make sure verbose gets updated when id 0 gets updated.
- This allows the following MCA option to have an impact on the
   framework verbose output as well.
   * `-mca mca_base_verbose stdout`

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-01-24 16:14:11 -06:00
Gilles Gouaillardet
1a6c17ec7d opal/util: plug a memory leak
by using opal_setenv() instead of putenv()

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:12:47 +09:00
Gilles Gouaillardet
dffaad9de2 opal/util: fix a race condition in opal_os_dirpath_create()
always check the permissions of the created directory,
in case some one else created the very same directory but
with incompatible permissions

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-19 14:02:47 +09:00
Ralph Castain
6da4dbbb33 Quick fix: save the errno from the mkdir call as the call to stat will likely overwrite it
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-18 15:42:31 -08:00
Ralph Castain
b257c32d2c Cleanup the os_dirpath logic so it doesn't error out if the directory actually gets created (regardless of what mkdir returns), and pretty-prints the error if it does error out.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-18 12:05:47 -08:00
Gilles Gouaillardet
a3f21fb2aa opal_os_dirpath_create: fix TOCTOU
as reported by Coverity with CID 70396

(cherry picked from commit 58d1b3f4d0)
2017-01-18 11:48:30 -08:00
Ralph Castain
9eab9a1ed3 Remove stale global variables
Revamp the event notification integration to rely on the PMIx event chaining and remove the duplicate chaining in OPAL. This ensures we get system-level events that target non-default handlers.

Restore the hostname entries for MPI-level error messages, but provide an MCA param (orte_hostname_cutoff) to remove them for large clusters where the memory footprint is problematic. Set the default at 1000 nodes in the job (not the allocation).

Begin first cut at memory profiler

Some minor cleanups of memprobe

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-02 14:04:24 -08:00
Gilles Gouaillardet
c9aeccb84e opal/if: open the if framework once in opal_init_util
the if framework is no more open in opal_if*, which plugs
several memory leaks

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-12-01 14:24:30 +09:00
Gilles Gouaillardet
8fd1c3f0df opal/util: handle a race condition in opal_os_dirpath_destroy
An file might have been destroyed by an other task between
readdir() and stat(), so simply ignore stat() failure.

That typically occurs when one task is removing the job_session_dir
and an other task is still removing its proc_session_dir.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-11-24 10:45:48 +09:00
Gilles Gouaillardet
eaee1332e1 opal/util/ethtool: add missing headers
and get Open MPI build on OpenBSD 6.0
2016-09-23 11:22:19 +09:00
Gilles Gouaillardet
e6f7facd7d opal/util: improve error message in opal_os_dirpath_create() 2016-09-18 17:10:47 +09:00
Gilles Gouaillardet
4b47daeeb0 opal/util: improve return status of opal_os_dirpath_create() 2016-09-18 12:32:42 +09:00
Gilles Gouaillardet
277c319389 opal/util: fix (again and again) incorrect type casting in opal_path_df
and silence CID 1371767

this fixes previous commits :
 - open-mpi/ompi@2eec8970ff
 - open-mpi/ompi@a439afce5b
2016-08-26 09:42:45 +09:00
Gilles Gouaillardet
2eec8970ff opal/util: fix (again) incorrect type casting in opal_path_df
this fixes previous commit open-mpi/ompi@a439afce5b
2016-08-24 12:50:15 +09:00
Gilles Gouaillardet
a439afce5b opal/util: fix incorrect type casting in opal_path_df 2016-08-24 10:26:13 +09:00
Ralph Castain
0e58609327 Fix a bug where we were requiring that all paths in $PATH be absolute. Some users provide relative paths in their environment, and we should respect those. 2016-08-12 11:28:57 -07:00
Gilles Gouaillardet
13009aa290 opal/alfg: have opal_random() wrapper always return a positive int 2016-08-09 17:12:30 +09:00
Thananon Patinyasakdikul
b3e9dadff2 libevent: use opal_random() instead of rand(3)
This commits changed rand(3) and family in libevent to use internal
random function provided in opal to prevent pertubing user's random seed.

Fixes open-mpi/ompi#1877
2016-08-03 09:18:12 -07:00
Gilles Gouaillardet
1f651d17c1 opal/util/ethtool: fix (infamous) strncpy usage
the infamous strncpy does not NULL terminate the destination when the buffer is truncated
do it ourself !

fix CID 1362576
2016-06-09 09:54:50 +09:00
Gilles Gouaillardet
5f565dfec3 configury: clean the flex generated .c files 2016-06-01 11:13:31 +09:00
Ralph Castain
42ecffb6d0 Move the registration of MCA params out of the init of the var system - put them in with the rest of the OPAL MCA param registrations
Take another shot at untangling the spaghetti

orterun: fix for command line parsing

orte-submit calls opal_init_util () before parsing out MCA command line
options (-mca, -am, etc). This prevents mpirun from setting opal MCA
variables for some frameworks as well as the MCA base. This is because
when a framework is opened all of its variables are set to read-only.
Eventually we want to lift this restriction on some MCA variables but
since -mca is affected we must parse out the MCA command line options
before opal_init_util(). This commit fixes the bug by adding a new
option to opal_cmd_line_parse (ignore unknown option) so orte-submit
can pre-parse the command line for MCA options.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>

Minor cleanups to avoid releasing/recreating the cmd line
2016-05-20 09:59:50 -07:00
Gilles Gouaillardet
a01a5487a8 opal/util/ethtool: use system ethtool_cmd_speed when available
Refs: open-mpi/ompi#1679
2016-05-20 09:05:09 +09:00
Jeff Squyres
87233aae49 ethtool: better handle portability
Be sure to handle the case where we don't have ethtool support at all.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-05-19 10:57:14 -07:00
Gilles Gouaillardet
fd93d236b1 opal/util/ethtool: fix compilation on older Linux when struct ethtool_cmd has no speed_hi field
Refs: open-mpi/ompi#1628
2016-05-19 11:58:04 +09:00
Karol Mroz
31e33a64f9 opal/util: add function to obtain interface speed
If kernel ethtool_cmd_speed() is not available, use copies if possible.

Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-05-18 16:25:51 +02:00
Jeff Squyres
265e5b9795 Merge pull request #1552 from kmroz/wip-hostname-len-cleanup-1
ompi/opal/orte/oshmem/test: max hostname length cleanup
2016-05-02 09:44:18 -04:00
Ralph Castain
6ac7929bd0 Extend the schizo framework to allow definition of CLI options by environment. Refactor orterun to mesh with the orted_submit code, thus improving code reuse. Eliminate the orte-submit tool as orterun can now meet that need.
Cleanups per @jjhursey review
2016-05-01 11:30:25 -07:00
Karol Mroz
e1c64e6e59 opal: standardize on max hostname length
Define OPAL_MAXHOSTNAMELEN to be either:
  (MAXHOSTNAMELEN + 1) or
  (limits.h:HOST_NAME_MAX + 1) or
  (255 + 1)

For pmix code, define above using PMIX_MAXHOSTNAMELEN.

Fixup opal layer to use the new max.

Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-04-24 08:19:47 +02:00
Nathan Hjelm
27f8a4e806 opal: add code patcher framework
This commit adds a framework to abstract runtime code patching.
Components in the new framework can provide functions for either
patching a named function or a function pointer. The later
functionality is not being used but may provide a way to allow memory
hooks when dlopen functionality is disabled.

This commit adds two different flavors of code patching. The first is
provided by the overwrite component. This component overwrites the
first several instructions of the target function with code to jump to
the provided hook function. The hook is expected to provide the full
functionality of the hooked function.

The linux patcher component is based on the memory hooks in ucx. It
only works on linux and operates by overwriting function pointers in
the symbol table. In this case the hook is free to call the original
function using the function pointer returned by dlsym.

Both components restore the original functions when the patcher
framework closes.

Changes had to be made to support Power/PowerPC with the Linux
dynamic loader patcher. Some of the changes:

 - Move code necessary for powerpc/power support to the patcher
   base. The code is needed by both the overwrite and linux
   components.

 - Move patch structure down to base and move the patch list to
   mca_patcher_base_module_t. The structure has been modified to
   include a function pointer to the function that will unapply the
   patch. This allows the mixing of multiple different types of
   patches in the patch_list.

 - Update linux patching code to keep track of the matching between
   got entry and original (unpatched) address. This allows us to
   completely clean up the patch on finalize.

All patchers keep track of the changes they made so that they can be
reversed when the patcher framework is closed.

At this time there are bugs in the Linux dynamic loader patcher so
its priority is lower than the overwrite patcher.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-13 17:16:13 -06:00
Nathan Hjelm
4cac623aeb opal/patch: add call to check if binary patching is supported
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-13 17:16:12 -06:00
Nathan Hjelm
7aa03d66b3 opal/memory: add support for patch based memory hooks
This commit adds support for runtime binary patching. The support is
broken down into two parts: util/opal_patcher.[ch] which contains the
functionality for runtime patching of symbols, and mca/memory/patcher
which patches the various symbols needed to provide support for memory
hooks. This work is preliminary and is based off work donated by IBM.

The patcher code is disabled if dlopen is disabled.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-13 17:14:31 -06:00
Ralph Castain
8c14df2328 Revert "Modify singularity support per patch from Greg Kurtzer"
This reverts commit open-mpi/ompi@f7257a8310.

Ensure that we properly cleanup the session directory tree. Prior code had issues with symlinks, especially if the file that the link points to was already removed as we traverse the tree. Also found that the dirent checks for directory type weren't fully portable, and so fall back to the stat-based approach which is known to be portable.

Fix singularity singletons by detecting we are in a container and properly setting the pmix selection to pick the isolated component. Remove a stale restriction blocking use of the sm btl
2016-03-24 11:27:18 -07:00
Nathan Hjelm
607be72de9 opal/keval_parse: fix conditional ordering
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-08 10:06:14 -07:00
Nathan Hjelm
63bac9a4e0 opal/util: fix bug in key value parser
This commit fixes a bug in the opal key value parser that might cause
the filename parser to go past the beginning of the string.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-07 14:51:29 -07:00
Nathan Hjelm
32236736a4 Fix parsing of envvars in MCA files
This commit fixes a memory corruption bug when parsing lines of the
form:

-x FOO=bar

The code was making changes to the size of the buffer allocated for
key_buffer without making the appropriate changes to
key_buffer_len. This was causing subsequent calls to save_param_name
to write to invalid memory.

This commit makes the following changes:

  - Fix the above bug by modifying trim_name to move the string within
    the buffer instead of re-allocating space for the trimmed string.

  - Cleaned up both trim_name and save_param_name. Both functions took
    a prefix and suffix to trim. Problem was the prefix was not
    treated like a prefix. Instead the "prefix" was located inside the
    string using strstr then the trimmed value started after the
    substring (even in the middle of the string). To allow trimming
    both -x and --x (as well as -mca and --mca) trim_name is now
    called with each prefix.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-02-17 14:58:05 -07:00
Ralph Castain
06c3dfc052 Refactor the ORTE DVM code so that external codes can submit multiple jobs using only a single connection to the HNP.
* Clean up the DVM so it continues to run even when applications error out and we would ordinarily abort the daemons.
* Create a new errmgr component for the DVM to handle the differences.
* Cleanup the DVM state component.
* Add ORTE bindings directory and brief README
* Pass a local tool index around to match jobs.
* Pass the jobid on job completion.
* Fix initialization logic.
* Add framework for python wrapper.
* Fix terminate-with-non-zero-exit behavior so it properly terminates only the indicated procs, notifies orte-submit, and orte-dvm continues executing.
* Add some missing options to orte-dvm
* Fix a bug in -host processing that caused us to ignore the #slots designator. Add a new attribute to indicate "do not expand the DVM" when submitting job spawn requests.
* It actually makes no sense that we treat the termination of all children differently than terminating the children of a specific job - it only creates confusion over the difference in behavior. So terminate children the same way regardless.

Extend the cmd_line utility to easily allow layering of command line definitions

Catch up with ORTE interface change and make build more generic.

Disable "fixed dvm" logic for now.

Add another cmd_line function to merge a table of cmd line options with another one, reporting as errors any duplicate entries. Use this to allow orterun to reuse the orted_submit code

Fix the "fixed_dvm" logic by ensuring we reset num_new_daemons to zero. Also ensure that the nidmap is sent with the first job so the downstream daemons get the node info. Remove a duplicate cmd line entry in orterun.

Revise the DVM startup procedure to pass the nidmap only once, at the startup of the DVM. This reduces the overhead on each job launch and ensures that the nidmap doesn't get overwritten.

Add new commands to get_orted_comm_cmd_str().

Move ORTE command line options to orte_globals.[ch].

Catch up with extra orte_submit_init parameter.

Add example code.

Add documentation.

Bump version.

The nidmap and routing data must be updated prior to propagating the xcast or else the xcast will fail.

Fix the return code so it is something more expected when an error occurs. Ensure we get an error returned to us when we fail to launch for some reason. In this case, we will always get a launch_cb as we did indeed attempt to spawn it. The error code will be returned in the complete_cb.

Fix the return code from orte_submit_job - it was returning the tracker index instead of "success". Take advantage of ORTE's pretty-print capabilities to provide a nice error output explaining why we failed to launch. Ensure we always get a launch_cb when we fail to launch, but no complete_cb as the job never launched.

Extend the error reporting capability to job completion as well.

Add index parameter to orte_submit_job().

Add orte_job_cancel and implement ORTE_DAEMON_TERMINATE_JOB_CMD.

Factor out dvm termination.

Parse the terminate option at tool level.

Add error string for ORTE_ERR_JOB_CANCELLED.

Add some safeguards.

Cleanup and/of comments.

Enable the return.

Properly ORTE_DECLSPEC orte_submit_halt.

Add orte_submit_halt and orte_submit_cancel to interface.

Use the plm interface to terminate the job
2016-02-13 08:10:44 -08:00
Edgar Gabriel
722aab92e6 - extend opal_path_nfs to retrieve the file system type
- use opal_path_nfs in the fs_base function to avoid code duplication.
2016-01-26 13:36:21 -06:00
Ralph Castain
4dad5de8ff Silence a couple of warnings - strncpy returns a char*, not an int 2016-01-16 09:44:52 -08:00
Gilles Gouaillardet
1d38430e43 opal: replace opal_convert_jobid_to_string with opal_snprintf_jobid 2016-01-14 10:39:03 +09:00
KAWASHIMA Takahiro
2dcb2d711b Makefile: Move fd.c to SOURCES from headers.
And reorder fd.h and few.h in alphabetical order.
2015-11-04 11:28:43 +09:00
Nathan Hjelm
6d3041335f opal/keyval: reset buffer pointer/size in finalize
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-10-20 13:10:44 -06:00
Gilles Gouaillardet
dc883cff8d opal/util: fix parse_ipv4_dots prototype 2015-10-01 14:03:08 +09:00
Nathan Hjelm
408da16d50 ompi/proc: add proc hash table for ompi_proc_t objects
This commit adds an opal hash table to keep track of mapping between
process identifiers and ompi_proc_t's. This hash table is used by the
ompi_proc_by_name() function to lookup (in O(1) time) a given
process. This can be used by a BTL or other component to get a
ompi_proc_t when handling an incoming message from an as yet unknown
peer.

Additionally, this commit adds a new MCA variable to control the new
add_procs behavior: mpi_add_procs_cutoff. If the number of ranks in
the process falls below the threshold a ompi_proc_t is created for
every process. If the number of ranks is above the threshold then a
ompi_proc_t is only created for the local rank. The code needed to
generate additional ompi_proc_t's for a communicator is not yet
complete.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-10 08:55:54 -06:00
Ralph Castain
d97bc29102 Remove OPAL_HAVE_HWLOC qualifier and error out if --without-hwloc is given 2015-09-04 16:54:40 -07:00
Ralph Castain
0d5814b5ca Cleanup Coverity issues 2015-08-29 21:19:27 -07:00
Ralph Castain
cf6137b530 Integrate PMIx 1.0 with OMPI.
Bring Slurm PMI-1 component online
Bring the s2 component online

Little cleanup - let the various PMIx modules set the process name during init, and then just raise it up to the ORTE level. Required as the different PMI environments all pass the jobid in different ways.

Bring the OMPI pubsub/pmi component online

Get comm_spawn working again

Ensure we always provide a cpuset, even if it is NULL

pmix/cray: adjust cray pmix component for pmix

Make changes so cray pmix can work within the integrated
ompi/pmix framework.

Bring singletons back online. Implement the comm_spawn operation using pmix - not tested yet

Cleanup comm_spawn - procs now starting, error in connect_accept

Complete integration
2015-08-29 16:04:10 -07:00
Ralph Castain
023936e84b Silence coverity warnings 2015-07-29 07:28:08 -07:00
Ralph Castain
8d128fe090 Remove the non-null attributes from the cmd_line parser as this isn't something we can guarantee, and the optimization isn't worth the potential for error 2015-06-25 13:26:20 -07:00
Nathan Hjelm
4d92c9989e more c99 updates
This commit does two things. It removes checks for C99 required
headers (stdlib.h, string.h, signal.h, etc). Additionally it removes
definitions for required C99 types (intptr_t, int64_t, int32_t, etc).

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-06-25 10:14:13 -06:00
Ralph Castain
869041f770 Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
Gilles Gouaillardet
58d1b3f4d0 opal_os_dirpath_create: fix TOCTOU
as reported by Coverity with CID 70396
2015-06-17 11:17:54 +09:00
Gilles Gouaillardet
de66447ebb opal_cmd_line_get_usage_msg: silence warning
as reported by Coverity with CID 1269967
2015-06-17 11:17:54 +09:00
Gilles Gouaillardet
f2f66e6e63 opal_daemon_init: silence warning
as reported by Coverity with CID 710642
2015-06-17 11:17:53 +09:00
Gilles Gouaillardet
8427e87ee9 opal_argv_delete: silence warning
as reported by Coverity with CID 71914
2015-06-17 11:17:53 +09:00