Commit graph

4257 commits

Author SHA1 Message Date
Nathan Hjelm
fe1c6bd881 Merge pull request #2840 from hjelmn/event_fix
verbs: remove extra event user increment/decrement operation
2017-01-26 07:30:24 -08:00
Ralph Castain
399de0738e Cleanup launch
Given that we only set OOB contact info from inside of events, or before we begin threaded operations (e.g., in the ess), allow set_contact_info to directly update the oob/base framework globals.

Correct the nidmap regex decompression routine.

Ensure that rank=1 daemon always sends back its topology as this is the most common use-case.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-25 22:06:09 -08:00
Nathan Hjelm
9f28c0af39 verbs: remove extra event user increment/decrement operation
Since the oob and connections systems do not work the same way they
did in older versions of Open MPI these operations are no longer
necessary. At best they do nothing and at worst they hurt performance
by making us enter the event library more often in opal_progress().

Fixes #2839

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-01-25 18:37:06 -07:00
Ralph Castain
2f4e87eae9 Have rank=1 daemon always send its topology back as this is the most common use-case
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-25 09:33:11 -08:00
Jeff Squyres
230bbc597d plm base: make sure to assign "node" early enough
Make sure to assign "node" before using it in ORTE_FLAG_SET.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-25 08:02:59 -08:00
Ralph Castain
184ccc8e91 Cleanup some code so it is clear that it is executing in an event. Ensure that peer event base is properly set on incoming connections
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-25 06:55:11 -08:00
Gilles Gouaillardet
ef10d3fd7b orte: add missing include file
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-25 16:15:20 +09:00
Joshua Hursey
0e9a06d2c3 orte/iof: Add app stderr to stdout redirection at source
* Add an MCA parameter to combine stdout and stderr at the source
   - `iof_base_redirect_app_stderr_to_stdout`
 * Aids in user debugging when using libraries that mix stderr with stdout

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-01-24 16:23:48 -06:00
Joshua Hursey
dcd9801f7c orte/iof: Add orte_map_stddiag_to_stdout option
* Similar to `orte_map_stddiag_to_stderr` except it redirects `stddiag`
   to `stdout` instead of `stderr`.
 * Add protection so that the user cannot supply both:
   - `orte_map_stddiag_to_stderr`
   - `orte_map_stddiag_to_stdout`

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-01-24 16:22:59 -06:00
Ralph Castain
ef86707fbe Deprecate the --slot-list parameter in favor of --cpu-list. Remove the --cpu-set param (mark it as deprecated) and use --cpu-list instead as it was confusing having the two params. The --cpu-list param defines the cpus to be used by procs of this job, and the binding policy will be overlaid on top of it.
Note: since the discovered cpus are filtered against this list, #slots will be set to the #cpus in the list if no slot values are given in a -host or -hostname specification.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-24 13:33:22 -08:00
Ralph Castain
4e9364b9a4 Merge pull request #2794 from rhc54/topic/regs
Next step in reducing launch time
2017-01-24 03:19:57 -08:00
Ralph Castain
86ab751c5e Next step in reducing launch time: begin reducing the size of the launch message itself. Start by expressing the daemon map as a set of three regular expression strings. On an 8k cluster, this reduces the nidmap contribution from over 200kBytes to 21 bytes in size.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-23 19:54:47 -08:00
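
As a rough illustration of why a regular-expression style encoding shrinks the daemon map so much (commit 86ab751c5e), here is a minimal Python sketch that collapses consecutively numbered hostnames into one range expression. The `compress_hosts` helper and the `prefix[width:first-last]` notation are hypothetical, chosen only for illustration; they are not claimed to match the strings ORTE actually generates.

```python
import re


def _format_run(prefix, width, first, last):
    # A run of one host is written out in full; longer runs become a range.
    if first == last:
        return f"{prefix}{first:0{width}d}"
    return f"{prefix}[{width}:{first}-{last}]"


def compress_hosts(hosts):
    """Collapse names such as node0001..node0004 into 'node[4:1-4]'.

    The output notation is hypothetical and only meant to show the idea.
    """
    out = []
    run = None  # (prefix, digit width, first number, last number)
    for host in hosts:
        m = re.fullmatch(r"(.*?)(\d+)", host)
        if m is None:
            # No numeric suffix: flush any open run and keep the name as-is.
            if run is not None:
                out.append(_format_run(*run))
                run = None
            out.append(host)
            continue
        prefix, digits = m.group(1), m.group(2)
        width, num = len(digits), int(digits)
        if run and run[0] == prefix and run[1] == width and num == run[3] + 1:
            run = (prefix, width, run[2], num)  # extend the current run
        else:
            if run is not None:
                out.append(_format_run(*run))
            run = (prefix, width, num, num)     # start a new run
    if run is not None:
        out.append(_format_run(*run))
    return ",".join(out)
```

For a cluster whose nodes are named uniformly, the whole map reduces to a single short string regardless of node count, which is the effect the commit message describes.
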
Gilles Gouaillardet
0bdc594b2e rml/base: plug a memory leak in orte_rml_API_recv_cancel()
simply return when the orte event thread has gone

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:12:47 +09:00
Ralph Castain
a61f7bdb26 Merge pull request #2780 from rhc54/topic/conn
Ensure we properly set the "shutting down" flag so connection drops by downstream peers are properly handled.
2017-01-23 06:40:28 -08:00
Ralph Castain
e7b12913b4 Ensure we properly set the "shutting down" flag so connection drops by downstream peers are properly handled.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-23 04:00:24 -08:00
Nathan Hjelm
954a4b7be3 oob/base: fix num_threads registration type
This commit fixes a bug in the registration of the num_threads MCA
variable. The variable is of type int and was being registered as
a boolean.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-01-22 14:02:34 -07:00
Ralph Castain
ac4fcd3f97 Ensure that oob/base level data is always accessed in the oob/base event thread. Make debruijn the default routed component
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-22 10:33:32 -08:00
Ralph Castain
6560617c04 Fix comm_spawn and orte-dvm by resetting all used "node mapped" flags after building the child list
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-22 05:55:53 -08:00
Ralph Castain
639cdd4f9d Add missing flag set to ensure nodes do not get double-added to job map.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-21 20:06:50 -08:00
Ralph Castain
be3ef77739 Improve packing efficiency by raising the initial buffer size and modifying the extension code. Flag if a job map has had its nodes added so we don't have to loop repeatedly to check it.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-21 14:03:19 -08:00
Ralph Castain
466cbd4d29 Rework the threading in oob/tcp so that daemons (including mpirun) use multiple progress threads to get messages out to their children, and so that the oob/base uses a separate one to setup sends. This allows the daemon cmd processor to execute in parallel with relay of messages, which significantly reduces launch times at scale
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-21 13:26:19 -08:00
Ralph Castain
668421b6ec Compress the xcast message if bigger than a defined size to further improve launch performance at scale
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-19 22:08:02 -08:00
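
The "compress only above a defined size" idea from commit 668421b6ec can be sketched in a few lines of Python. The one-byte flag, the `COMPRESS_THRESHOLD` value, and the function names are assumptions made for this example, not the framing or cutoff that ORTE uses.

```python
import zlib

COMPRESS_THRESHOLD = 4096  # hypothetical cutoff in bytes


def pack_message(payload: bytes, threshold: int = COMPRESS_THRESHOLD) -> bytes:
    """Prefix a one-byte flag; compress only when the payload exceeds the threshold."""
    if len(payload) > threshold:
        return b"\x01" + zlib.compress(payload)
    # Small messages are sent as-is: compressing them costs CPU for little gain.
    return b"\x00" + payload


def unpack_message(wire: bytes) -> bytes:
    """Reverse pack_message by inspecting the leading flag byte."""
    flag, body = wire[:1], wire[1:]
    return zlib.decompress(body) if flag == b"\x01" else body
```
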
Ralph Castain
1f46e48b94 Have mpirun and orteds activate the oob/tcp progress thread by default, leaving a way to turn it off via MCA param. Provide a method by which the add_procs command can be processed in parallel with relaying the cmd message to the next daemons down the tree.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-19 18:52:58 -08:00
Ralph Castain
bb132f6d03 Merge pull request #2764 from rhc54/topic/dvm
If a tool sees the HNP it is attached to die (thereby losing connecti…
2017-01-19 15:39:30 -08:00
Ralph Castain
ca50b31de1 Merge pull request #2762 from rhc54/topic/oobfast
Speed up the OOB/TCP communications by using writev instead of writing the header and then separately writing the body
2017-01-19 15:39:06 -08:00
Ralph Castain
19bb64cfb8 If a tool sees the HNP it is attached to die (thereby losing connection), then stop the event loop instead of going through the abort code path. This will allow the tool to cleanup before exiting
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-19 14:04:06 -08:00
Ralph Castain
e5f687f896 Speed up the OOB/TCP communications by using writev instead of writing the header and then separately writing the body
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-19 13:03:44 -08:00
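
To show what the writev change in commit e5f687f896 buys, the following Python sketch sends a header and a body with a single gathered write instead of two separate ones. The 4-byte length header and the `send_framed` name are assumptions for this example; the real OOB/TCP header layout is different.

```python
import os
import struct


def send_framed(fd: int, body: bytes) -> int:
    """Send a 4-byte length header and the body with one writev() call.

    Returns the number of bytes written. A production sender would loop
    on short writes; this sketch does not.
    """
    header = struct.pack("!I", len(body))  # hypothetical header: body length only
    return os.writev(fd, [header, body])
```

One system call per message instead of two halves the per-message syscall overhead and avoids emitting the header as a separate small segment.
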
Ralph Castain
368684bd63 Revert e9bc293 and try a different approach for scalably dealing with hetero clusters. Have each orted send back its topo "signature". If mpirun detects that this signature has not been seen before, then ask for that daemon to send back its full topology description. This allows the system to only get the topology once for each unique topo in the cluster.
Cleanup a typo, and remove no longer needed MCA params for hetero nodes and hetero apps. Hetero nodes will always be automatically detected. We don't support a mix of 32 and 64 bit apps

Modify the orte_node_t to use orte_topology_t instead of hwloc_topology_t, updating all the places that use it. Ensure that we properly update topology when we see a different one on a compute node.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-18 10:22:15 -08:00
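
The signature-based approach described in commit 368684bd63 amounts to a cache keyed by topology signature. A minimal Python sketch, assuming a hypothetical `TopologyCache` class and a caller-supplied `fetch_full` callback that stands in for asking the daemon for its full topology:

```python
class TopologyCache:
    """Track one full topology per unique signature (illustrative only)."""

    def __init__(self):
        self._by_signature = {}  # signature -> full topology description
        self.full_requests = 0   # how many full topologies were requested

    def on_daemon_report(self, signature, fetch_full):
        """Return the topology for this signature, fetching it only if unseen."""
        if signature not in self._by_signature:
            self.full_requests += 1
            self._by_signature[signature] = fetch_full()
        return self._by_signature[signature]
```

On a cluster with a handful of distinct node types, the number of full topology transfers is bounded by the number of distinct signatures rather than by the number of daemons.
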
Ralph Castain
e9bc2934be Add an MCA param "hnp_on_smgmt_node" that mpirun can use to tell the orteds to ignore its topology signature as mpirun is executing on a system mgmt node, and hence a different topology than the compute nodes
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-16 19:32:01 -08:00
Ralph Castain
74a285be83 Cancel the waitpid callback once the waitpid on a process has fired to avoid multiple notifications
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-16 14:32:02 -08:00
Ralph Castain
9e8c7d6295 Silence Coverity warning
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-15 07:51:37 -08:00
Ralph Castain
6b34cc67d6 Correct typo
Fixes #2691

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-15 07:48:31 -08:00
Ralph Castain
3a157f0496 One more time - we "push" IOF for stdout, stderr, and stddiag with separate calls. However, we were creating the sinks for all three of them each time, which caused them to leak. Create the sinks only once for each channel.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-14 17:40:36 -08:00
Ralph Castain
b55c03255a Strange - I had created a new IOF API "complete" for cleaning up at the end of jobs, but somehow the implementation is missing. It also appears that the orted's never actually cleaned up their job-related information. These things are fine for normal mpirun-based operations, but cause significant resource leaks for the DVM.
Complete the implementation and seal the leaks

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-12 19:54:18 -08:00
Ralph Castain
0e2df3be3e Missed one spot - plug fd leaks in orteds
Fixes #2691

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-12 13:45:46 -08:00
Ralph Castain
9ad02b5d13 Merge pull request #2718 from rhc54/topic/leaks
Don't remove the IOF framework's tracking info for a proc until the state machine tells it to do so.
2017-01-12 09:57:17 -08:00
Nathan Hjelm
110840fc87 ess/hnp: add support for forwarding additional signals (#2712)
* ess/hnp: add support for forwarding additional signals

This commit adds support to the hnp ess module to forward additional
signals beyond the default SIGUSR1, SIGUSR2, SIGTSTP, and SIGCONT.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>

* Generalize this a bit to allow a broader range of signals to be forwarded. Turns out that SIGURG is now a "standard" signal, though the value differs across systems. So setup to forward it (and some friends) if they are defined. Allow users to provide the signal name (instead of the integer value) as the value of even the more common signals does vary across systems. Don't limit the number that can be supported.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>

* ess/hnp: fix some bugs in the signal forwarding code

This commit fixes two bugs:

 - signals_set needs to be set even if no signals are being
   forwarded. If it is not set we will SEGV in libevent if
   ess_hnp_forward_signals == none.

 - SIGTERM and SIGHUP are handled with a different type of handler. Do
   not allow the user to specify these to be forwarded.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>

* We are sure to get "dinged" if error messages aren't nicely output via show_help, so do so here

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-12 10:09:41 -07:00
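
The reason commit 110840fc87 accepts signal names rather than integers is that the numeric values vary across systems. A small Python sketch of name-based resolution, assuming a hypothetical `resolve_signals` helper; it skips names the platform does not define and refuses SIGTERM and SIGHUP, mirroring the restriction stated in the commit message:

```python
import signal


def resolve_signals(names):
    """Map signal names to this platform's numbers, skipping undefined names."""
    resolved = {}
    for name in names:
        key = name.upper()
        if not key.startswith("SIG"):
            key = "SIG" + key
        if key in ("SIGTERM", "SIGHUP"):
            # Per the commit message these use a different handler type.
            raise ValueError(f"{key} cannot be forwarded")
        signum = getattr(signal, key, None)
        # Only accept real signal constants (not SIG_DFL, SIG_IGN, ...).
        if isinstance(signum, signal.Signals):
            resolved[key] = int(signum)
    return resolved
```
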
Ralph Castain
fa419d3c0d Don't remove the IOF framework's tracking info for a proc until the
state machine tells it to do so. This plugs leaked file descriptors as
we were losing track prior to destructing the resources.

Fixes #2691

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-12 08:34:29 -08:00
Ralph Castain
aff3a00059 Protect default mapping/binding options for cases where no NUMA or
SOCKET objects exist - like VMs

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-11 09:44:44 -08:00
Ralph Castain
93e4935902 Be a tad more cautious before releasing objects when running in DVM mode
Fixes #2700

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-10 14:04:27 -08:00
Gilles Gouaillardet
44c1ff60f1 Merge pull request #2672 from ggouaillardet/topic/misc_memory_leaks
Plug misc memory leaks
2017-01-10 13:16:04 +09:00
Joshua Ladd
3e23380bba Merge pull request #2675 from artpol84/orte/state/exit_1_fix
orte/odls: Fix ORTE state machine for the non-zero exit case
2017-01-09 12:32:37 -05:00
Joshua Ladd
7fc9f9bbac Merge pull request #2620 from karasevb/fix_rmaps_mindist
rmaps/mindist: fix pmix errors
2017-01-06 17:26:48 -05:00
Ralph Castain
684e69695f Minor cleanups to eliminate warnings
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-06 08:44:10 -08:00
Artem Polyakov
3eb6c98542 orte/odls: Fix ORTE state machine for the non-zero exit case
This commit fixes a rare race condition that occurs when a process
calling `exit(-1)` has a delay between fd cleanup and the actual
OS-level exit. This may happen if the process has some work to do
in `on_exit()`.

**Problem description**:
Consider an application process that has called `exit(nonzero)`: its
fds were closed,
but its actual termination at the OS level is delayed by some cleanup (e.g.
in callbacks registered via `on_exit()`).
The observed sequence of events was the following:

* orted gets the stdio disconnection and activates the `IOF COMPLETE` state.
* a parallel OOB disconnection causes the `COMMUNICATION FAILURE` state to be
activated.
* during `COMMUNICATION FAILURE` processing, `odls_base_default_wait_local_proc`
is called even though the real waitpid wasn't yet called (the code mentions that
waitpid might not be called for an unspecified reason). Because of that, the real exit
code is unknown and is set to 0. The `odls_base_default_wait_local_proc` callback sees
the `IOF COMPLETE` flag and, in conjunction with the 0 exit code, activates the
`WAITPID FIRED` state.
* processing of `WAITPID FIRED` leads to `NORMALLY TERMINATED` being
activated.
* the `NORMALLY TERMINATED` state, in particular, causes the `ORTE_PROC_FLAG_ALIVE` flag
for this proc to be dropped.
* when the application process finally exits, `wait_signal_callback` is
launched. It sets the real exit code and calls `odls_base_default_wait_local_proc`
again, but this time, since the process has the `ORTE_PROC_FLAG_ALIVE` flag
dropped, the `WAITPID FIRED` state is activated (instead of `EXITED WITH NON-ZERO`),
leading to the hang that was observed.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2017-01-06 11:12:55 +02:00
Gilles Gouaillardet
6b9343a966 plm/rsh: plug a memory leak
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-06 15:38:45 +09:00
Gilles Gouaillardet
8ba92d7516 iof/base: plug a memory leak in orte_iof_base_close()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-06 15:38:45 +09:00
Gilles Gouaillardet
7fe6840232 state/hnp: plug a memory leak
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-06 15:38:45 +09:00
Gilles Gouaillardet
4d58b8dcae ess/pmi: plug a memory leak
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-06 15:38:45 +09:00
Gilles Gouaillardet
c0c5dd8ccc orte: plug a memory leak in orte_rml.recv_cancel
do not invoke orte_rml.recv_cancel after the orte progress thread has gone

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-06 15:38:44 +09:00
Gilles Gouaillardet
17fac4bfd1 grpcomm/base: get rid of the seq_num field of the orte_grpcomm_signature_t struct
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-06 15:38:44 +09:00
Gilles Gouaillardet
fe25f50871 grpcomm/base: plug a memory leak on finalize
manually allocate sequence numbers to be stored into the
orte_grpcomm_base.sig_table hash table, and manually release
them on orte_grpcomm_base_close()

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-06 15:38:44 +09:00
Gilles Gouaillardet
0ee5d56ab1 grpcomm/direct: plug a memory leak in barrier_release()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-06 13:46:35 +09:00
Gilles Gouaillardet
f2d6584189 grpcomm/base: plug misc memory leaks
- add a destructor to orte_grpcomm_caddy_t in order to plug a memory leak

- plug a memory leak in barrier_release()

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-06 13:46:21 +09:00
Gilles Gouaillardet
58f2a764f9 ess/hnp: plug memory leaks
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-06 11:35:59 +09:00
Gilles Gouaillardet
24c61b0625 oob/tcp: plug a memory leak in mca_oob_tcp_component_lost_connection()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-06 11:35:59 +09:00
Gilles Gouaillardet
c7d9e62d47 rml/base: plug a memory leak
add a destructor to orte_rml_send_request_t in order
to plug a memory leak

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-06 11:35:59 +09:00
Ralph Castain
6509f60929 Complete the memprobe support. This provides a new scaling tool called "mpi_memprobe" that samples the memory footprint of the local daemon and the client procs, and then reports the results. The output contains the footprint of the daemon on each node, plus the average footprint of the client procs on that node.
Samples are taken after MPI_Init, and then again after MPI_Barrier. This allows the user to see memory consumption caused by add_procs, as well as any modex contribution from forming connections if pmix_base_async_modex is given.

Using the probe simply involves executing it via mpirun, with however many copies you want per node. Example:

$ mpirun -npernode 2 ./mpi_memprobe
Sampling memory usage after MPI_Init
Data for node rhc001
	Daemon: 12.483398
	Client: 6.514648

Data for node rhc002
	Daemon: 11.865234
	Client: 4.643555

Sampling memory usage after MPI_Barrier
Data for node rhc001
	Daemon: 12.520508
	Client: 6.576660

Data for node rhc002
	Daemon: 11.879883
	Client: 4.703125

Note that the client value on node rhc001 is larger - this is where rank=0 is housed, and apparently it gets a larger footprint for some reason.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-05 10:32:17 -08:00
Ralph Castain
9eab9a1ed3 Remove stale global variables
Revamp the event notification integration to rely on the PMIx event chaining and remove the duplicate chaining in OPAL. This ensures we get system-level events that target non-default handlers.

Restore the hostname entries for MPI-level error messages, but provide an MCA param (orte_hostname_cutoff) to remove them for large clusters where the memory footprint is problematic. Set the default at 1000 nodes in the job (not the allocation).

Begin first cut at memory profiler

Some minor cleanups of memprobe

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-02 14:04:24 -08:00
Ralph Castain
fe68f23099 Only instantiate the HWLOC topology in an MPI process if it actually will be used.
There are only five places in the non-daemon code paths where opal_hwloc_topology is currently referenced:

* shared memory BTLs (sm, smcuda). I have added a code path to those components that uses the location string
  instead of the topology itself, if available, thus avoiding instantiating the topology

* openib BTL. This uses the distance matrix. At present, I haven't developed a method
  for replacing that reference. Thus, this component will instantiate the topology

* usnic BTL. Uses the distance matrix.

* treematch TOPO component. Does some complex tree-based algorithm, so it will instantiate
  the topology

* ess base functions. If a process is direct launched and not bound at launch, this
  code attempts to bind it. Thus, procs in this scenario will instantiate the
  topology

Note that instantiating the topology on complex chips such as KNL can consume
megabytes of memory.

Fix pernode binding policy

Properly handle the unbound case

Correct pointer usage

Do not free static error messages!

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-29 10:33:29 -08:00
Ralph Castain
3a2d6a5ab6 Begin to reduce reliance of application procs on the topology tree itself by having the daemon provide more detailed info. In this case, provide the topology description string so that procs can readily determine the number of types of objects on the node, and a "locality" string that describes which objects this process is executing upon. The latter allows a process to compute the objects of overlap between itself and another proc without consulting the topology tree.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-28 09:14:26 -08:00
Ralph Castain
7866bb1119 Add debug, cleanup cpus/rank
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-27 21:25:52 -08:00
Ralph Castain
1e4bffd937 Fix mapping directive checks
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-27 20:42:47 -08:00
Ralph Castain
791f4f1ce3 Adjust debug output for clarity
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-26 14:04:20 -08:00
Ralph Castain
ef3f748d0d Transfer some minor cleanups back from the PMIx reference server
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-23 08:46:04 -08:00
Boris Karasev
5fb3e0a9b6 rmaps/mindist: fix pmix errors
Fixed the case where only part of the nodes in the allocation
are used by the application processes.

Force PMIx nodemap key to only contain nodes that are actually
used by the application processes.

Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2016-12-21 06:42:04 +02:00
Ralph Castain
ea133206ec Sync the internal OMPI component to PMIx master
Update external PMIx v2.x component
Add missing Makefile

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-19 19:14:16 -08:00
Ralph Castain
256b5adac5 Transfer across final fixes from debugger attach work
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-19 00:34:27 -08:00
Ralph Castain
c6f6f40529 Transfer debugger support changes
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-17 18:14:46 -08:00
Ralph Castain
269753f5c1 Transfer back changes from debugger attach work
Silence warning

Remove debug

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-17 10:00:52 -08:00
Ralph Castain
215d6290e0 Add a flux component for LLNL
Fine tuning of flux component
Fix a few minor issues with the initial cut:
* Job id could be obtained from the PMI kvsname like SLURM,
  but simpler to getenv (FLUX_JOB_ID)
* Flux pmi-1 doesn't define PMI_BOOL, PMI_TRUE, PMI_FALSE
* Flux pmi-1 maps the deprecated PMI_Get_kvs_domain_id() to
  PMI_KVS_Get_my_name() internally, so just call that instead.
* Drop residual slurm references.

Add wrappers for PMI functions so that if HAVE_FLUX_PMI_LIBRARY
is not defined, the component can dlopen libpmi.so at location
specified by the FLUX_PMI_LIBRARY_PATH env variable, which adds
flexibility.  If HAVE_FLUX_PMI_LIBRARY is defined, link with
libpmi.so at build time in the usual way.

Update configury for flux component

Update m4 so the configure options work as follows:

 --with-flux-pmi
      Build Flux PMI support (default: yes)

 --with-flux-pmi-library
      Link Flux PMI support with PMI library at build
      time. Otherwise the library is opened at runtime at
      location specified by FLUX_PMI_LIBRARY_PATH environment
      variable. Use this option to enable Flux support when
      building statically or without dlopen support (default: no)

If the latter option is provided, the library/header is located at
build time using the pkg-config module 'flux-pmi'.  Otherwise there
is no library/header dependency.

Handle the case where ompi is configured with --disable-dlopen
or --enable-static.  In those cases, don't build the component
unless --with-flux-pmi-library is provided.

It is fatal if the user explicitly requests --with-flux-pmi but
it cannot be built (e.g. due to --disable-dlopen).

Add a schizo/flux component

Update schizo/flux component

Eliminate slurm-specific usage cases.

Since the module is only loaded if FLUX_JOB_ID is set, there are
only two cases to handle:

1) App was launched indirectly through mpirun.  This is not yet
supported with Flux, but hook remains in case this mode is supported
in the future.

2) App was launched directly by Flux, with Flux providing
CPU binding, if any.

Fix up white space in pmix/flux component

Drop non-blocking fence from pmix:flux component

The flux PMI-1 library is not thread safe, therefore
register a regular blocking fence callback instead of the
thread-shifting fencenb().

pmix/flux component avoids extra PMI_KVS_Gets

Keys stored into the base cache under the wildcard
rank are not intended to be part of the global key namespace.
These keys therefore should not trigger a PMI_KVS_Get() if they
are not found in the cache.

Minor pmix/flux component cleanup

pmix/flux: drop code for fetching unused pmix_id

pmix/flux: err_exit must return error

Problem: in flux_init(), although 'ret' (variable holding
err_exit return code) is initialized to OPAL_ERROR, the
variable is reused as a temporary result code, so if there are
some successes followed by a failure that doesn't set 'ret',
flux_init() could return success with PMI not initialized.

Ensure that a "goto err_exit" returns OPAL_ERROR if 'ret'
is not set to some other error code.

pmix/flux: don't mix OPAL_ and PMI_ return codes

Problem: flux_init() can return both PMI_ and OPAL_ return
codes.  Although OPAL_SUCCESS and PMI_SUCCESS are both defined
as 0, other codes are not compatible.

Ensure that flux_init() consistently uses 'rc' for PMI_
return codes and 'ret' for OPAL_ return codes.

pmix/flux: factor out repeated code for cache put

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-16 18:26:38 -08:00
Ralph Castain
2af677b1cf Ensure that we don't bind-by-default in an oversubscribed condition
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-15 07:58:52 -08:00
Ralph Castain
884fb7fcf2 Update the PMIx2 support to include the latest shared memory optimizations
Update ORTE support for dynamic PMIx operations e.g., PMIx_Spawn
Update to track master
Ensure that --disable-pmix-dstore actually disables the dstore. Sync to a few debugger updates

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-14 15:00:10 -08:00
Ralph Castain
9f69b0183f Ensure jobs that fail always return a non-zero exit code.
Thanks to Ashley Pittman for the report.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-14 09:41:06 -08:00
rhc54
341ab683de Merge pull request #2532 from rhc54/topic/pmixptl
Update to latest PMIx master + PTL branch
2016-12-07 17:28:22 -08:00
Ralph Castain
e1aa7939ef Correctly cleanup the local children and node map info on remote orteds upon job completion. Ensure that register_nspace only includes procs from that job in the proc map
Thanks to Ashley Pittman for the report

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-07 13:53:00 -08:00
Gilles Gouaillardet
123036dbf8 ess/base: invoke orte_routed.update_routing_plan() earlier
fix an issue that can be evidenced with two nodes
n0$ mpirun --host n1:1 --mca oob_tcp_static_ipv4_ports 1234 -np 1 --mca routed radix --mca oob tcp true

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-12-07 17:19:25 +09:00
Ralph Castain
fbed2d794a Update to latest PMIx master + PTL branch
Update the usock component to disable it

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-06 20:47:44 -08:00
Ralph Castain
85a634926b Update signal handling to introduce a pause between SIGCONT and SIGTERM, followed by another pause before SIGKILL. Do this within the odls/kill_local_procs function while we know we are blocked in an event, and before the daemon shuts down the event progress loop
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-06 12:34:42 -08:00
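
The escalation order described in commit 85a634926b (SIGCONT, pause, SIGTERM, pause, SIGKILL) can be sketched in Python as follows. The `terminate_with_escalation` name and the pause length are assumptions for this example; the actual logic lives in the odls kill_local_procs function and is written in C.

```python
import signal
import subprocess
import time


def terminate_with_escalation(proc: subprocess.Popen, pause: float = 0.5):
    """Send SIGCONT, pause, send SIGTERM, pause, then SIGKILL if still running."""
    proc.send_signal(signal.SIGCONT)  # wake a stopped process so it can react
    time.sleep(pause)
    proc.send_signal(signal.SIGTERM)  # polite request to exit
    try:
        proc.wait(timeout=pause)      # second pause: give it time to clean up
    except subprocess.TimeoutExpired:
        proc.kill()                   # SIGKILL as a last resort
        proc.wait()
    return proc.returncode
```

The SIGCONT matters because a stopped process cannot act on SIGTERM until it is resumed, and the pauses give well-behaved applications a chance to exit cleanly before being killed.
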
Ralph Castain
d8f262e39b Resolve a duplicate symbol issue when the rml/ofi component is enabled
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-05 13:41:38 -08:00
Ralph Castain
79cde184ad Allow a PMIx tool to spawn a job
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-03 16:00:47 -08:00
Ralph Castain
1a0bccb536 Now that PMIx has settled on its release strategy and numbering, update the OPAL pmix framework to track
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-02 15:44:43 -08:00
Ralph Castain
88313debc2 Per discussion on email thread, restore placement of child procs in their own process group so that any signal sent to one of our children is automatically propagated to any child process they might have spawned.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-02 03:36:22 -08:00
Ralph Castain
dd491db21f Fix IOF when outputting to files - the remote orteds were failing to output stdout/err from their procs.
Silence a warning in orted_submit

Protect against a free'd value in an error path when forming oob tcp connections

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-01 14:12:47 -08:00
Artem Polyakov
58300afff2 orte/oob/tcp: Plug the memory leak.
Plug coverity defect CID 1396541.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2016-12-01 06:48:25 +07:00
Ralph Castain
47ed214458 Do not resend if max_retries is exceeded. Make a verbose output available to tell us where the intended message was to go.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-29 19:21:16 -08:00
rhc54
d31f173744 Merge pull request #2476 from rhc54/topic/dbgupdate
Bring forward the debugger-related changes
2016-11-29 19:10:32 -08:00
Ralph Castain
d5fd635efe Bring forward the debugger-related changes
Refs https://github.com/open-mpi/ompi/pull/2425

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-29 13:15:20 -08:00
Ralph Castain
30ff8be9c9 Silence minor warnings
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-29 08:33:22 -08:00
Jeff Squyres
a6d390fe7b Merge pull request #2461 from artpol84/oob/msg_drop
orte/oob/tcp: Fix message dropping in case of concurrent connection.
2016-11-29 11:23:15 -05:00
Ralph Castain
f7699a7eeb Silence warnings in a .opal_ignore'd component
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-28 13:18:25 -08:00
Artem Polyakov
ada93e0c02 orte/oob/tcp: Fix message dropping in case of concurrent connection.
The problem was observed for direct modex used with the recursive doubling
algorithm (used for collective ID calculation prior to d52a2d081e9598a9ac9a50fb4b013a6d2a72375b),
which is pairwise in nature, so counter-connections are highly likely.

The following scenario uncovered the issue:
* ranks `x` and `y` want to communicate with each other, `x` < `y`;
* rank `x` initiates the connection and sends the ack;
* rank `y` starts to `connect()` and gets the ack from `x`;
* `y` identifies that it has already started connecting and that `y` > `x`, so it rejects the incoming connection;
* `x` sees that its connection was rejected in `mca_oob_tcp_peer_recv_connect_ack()` when trying to
read the message header using `tcp_peer_recv_blocking()`, which calls `mca_oob_tcp_peer_close()`,
which effectively flushes all the messages in the peer->send_queue;
* `y` sends the ack to `x` and the connection is established; however, all the messages for the peer
at `x` have vanished (except the front one in peer->send_msg).

This commit introduces a "nack" function that will be used on the `y` side to tell `x` that `y` has
priority and that `x`'s connection should be closed. This avoids "guessing" on an unexpectedly
closed connection.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2016-11-27 04:58:34 +07:00
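
The tie-break in commit ada93e0c02 can be modeled in a few lines of Python. This is only a sketch under stated assumptions: `Reply`, `answer_incoming`, and `Peer` are hypothetical names, and the real logic lives in the C oob/tcp component. The point is that an explicit nack lets the lower-ranked side close its own socket while keeping its queued messages, instead of treating the rejection as an unexpected close that flushes the queue.

```python
from enum import Enum


class Reply(Enum):
    ACK = "ack"
    NACK = "nack"


def answer_incoming(my_rank: int, peer_rank: int, i_am_connecting: bool) -> Reply:
    """Decide how to answer a connection request from peer_rank."""
    if i_am_connecting and my_rank > peer_rank:
        # Our own connect() has priority; tell the peer explicitly.
        return Reply.NACK
    return Reply.ACK


class Peer:
    """Sender-side state for one remote peer (illustrative only)."""

    def __init__(self):
        self.send_queue = []
        self.socket_open = True

    def on_reply(self, reply: Reply) -> None:
        if reply is Reply.NACK:
            # Close only our socket; queued messages wait for the peer's connection.
            self.socket_open = False
```
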
Howard Pritchard
2cbc0e8472 pmix/cray: fix disable-dlopen problem
PR open-mpi/ompi#2432 introduced a regression where configure
and build with --disable-dlopen caused a build failure owing
to unresolved alps lli symbols in the libopal-pal shared library.

This commit fixes this problem.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2016-11-21 13:45:10 -06:00
Ralph Castain
eb67c2fd44 Update OFI/rml component - still .opal_ignore'd
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-18 14:54:26 -08:00
Ralph Castain
9c6c2fa61d Bring the v2.0.x debugger patch up to the master branch
Ensure the personality gets set as specified by user, or defaults to
"ompi"

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-18 12:45:45 -08:00
Ralph Castain
188880be3f Since static ports are only used by ORTE if the runtime option is given,
there is no need for a configure option as well - so remove the
--enable-orte-static-ports configure option. When decoding the daemon
nidmap, mark new daemons as ALIVE by default - we will discover dead
ones as we go.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-04 05:01:42 -07:00
Gilles Gouaillardet
da0c873e14 oob/tcp: enhance debugging output
display the hop node used to send a message
(if the message is sent directly, then the hop is the destination)

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-11-04 14:16:06 +09:00
Josh Hursey
b18598f6c7 Merge pull request #2329 from jjhursey/topic/short-hostname-lsf-fix
ras/*: Fix !orte_keep_fqdn_hostnames for RAS components
2016-11-02 10:49:08 -05:00
Ralph Castain
435d771e76 Fix the radix routed component to correctly handle connected tools - in such cases, the route must be direct to the tool.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-01 19:03:26 -07:00
Ralph Castain
64873487b4 Remove the max_connections parameter from the radix component as it is confusing. Modify PMIx client init so that it simply returns the nspace/rank if called by a server - this allows the server to retrieve its assigned ID. Register the server's nspace so client-side operations can succeed
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-11-01 12:17:11 -07:00
Joshua Hursey
ed5268a96a ras/slurm: Fix !orte_keep_fqdn_hostnames for Slurm
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2016-11-01 13:21:30 -05:00
Joshua Hursey
5a4c52d9cb ras/loadleveler: Fix !orte_keep_fqdn_hostnames for Loadleveler
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2016-11-01 13:21:30 -05:00
Joshua Hursey
8230201ad1 ras/gridengine: Fix !orte_keep_fqdn_hostnames for GridEngine
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2016-11-01 13:21:30 -05:00
Joshua Hursey
9643175e40 ras/tm: Fix !orte_keep_fqdn_hostnames for TORQUE
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2016-11-01 13:21:24 -05:00
Joshua Hursey
8d02a33639 ras/lsf: Fix !orte_keep_fqdn_hostnames for LSF
* By default, make sure that we are using the short hostnames and not
   the fully qualified hostnames when running under LSF.
 * Related to commit open-mpi/ompi@d26dd2c20e

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2016-11-01 13:04:52 -05:00
rhc54
6074c2a2a9 Merge pull request #2322 from rhc54/topic/routed
Update the routed components as we no longer need to init_routes.
2016-10-31 13:37:07 -07:00
Ralph Castain
b8c5d1ad88 Update the routed components as we no longer need to init_routes. Fixes case of direct launch via srun
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-10-31 12:38:13 -07:00
Jeff Squyres
773d6039e7 Merge pull request #2306 from hjelmn/alps_cores
ras/alps: use cpuCnt if using hwthreads as cores
2016-10-31 15:22:13 -04:00
Gilles Gouaillardet
30298cc83c oob/tcp: remove debug that should have never been committed
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-10-31 16:41:14 +09:00
Gilles Gouaillardet
75e96004a4 oob/tcp: fix a typo in mca_oob_tcp_component_no_route()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-10-31 16:30:24 +09:00
Gilles Gouaillardet
fb5bcc47ce ess/singleton: use opal_setenv instead of putenv
so it fixes a memory leak on finalize

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-10-28 09:32:30 +09:00
Gilles Gouaillardet
ef2b3ac8d2 rml/oob: fix misc memory leaks in open_conduit()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-10-28 09:28:42 +09:00
Gilles Gouaillardet
831f7d9c9d rml/base: plug misc memory leaks
plug leaks in orte_rml_API_get_contact_info() and orte_rml_base_close()

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-10-28 09:28:05 +09:00
Nathan Hjelm
c3614d30fa ras/alps: use cpuCnt if using hwthreads as cores
This commit updates the alps ras component to allow the use of
hyperthreads on compute nodes. In this case we need to use the cpuCnt
value from the node structure instead of numPEs.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-10-27 09:51:17 -06:00
Gilles Gouaillardet
3d4285b04d oob/tcp: silence valgrind warning
fully initialize allocated memory to keep valgrind happy

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-10-27 17:12:46 +09:00
rhc54
2b18044051 Merge pull request #2301 from rhc54/topic/update
Update PMIx to latest master tarball. Ensure we set the HNP name for …
2016-10-26 16:42:15 -07:00
Ralph Castain
f298f294e1 Update PMIx to latest master tarball. Ensure we set the HNP name for orted's so that PMIx_Lookup can find the server
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-10-26 15:48:56 -07:00
Anandhi S Jayakumar
94593ca20b Adding ofi plugin to allow for opening a conduit to use ethernet/fabric.
modified:   ../orte/mca/rml/base/rml_base_frame.c
	modified:   ../orte/mca/rml/base/rml_base_stubs.c
	deleted:    ../orte/mca/rml/ofi/.opal_ignore
	modified:   ../orte/mca/rml/ofi/Makefile.am
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c
	modified:   ../orte/mca/rml/ofi/rml_ofi_send.c
	modified:   ../orte/test/system/ofi_conduit_stress.c

	Removed stale include directive
	modified:   ../orte/mca/rml/ofi/Makefile.am

The ofi plugin supports multiple providers, and identifies them
by ofi_prov_id,  changed the previous name conduit_id to ofi_prov_id
	modified:   ../orte/mca/rml/base/base.h
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c
	modified:   ../orte/mca/rml/ofi/rml_ofi_request.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_send.c

Fixed merge issues, and minor pull-request comments
	modified:   ../orte/mca/rml/base/base.h
	modified:   ../orte/mca/rml/base/rml_base_frame.c
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c

Removed trailing space
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c

Cleaned up test- ofi_conduit_stress.c
	modified:   ../orte/test/system/ofi_conduit_stress.c

cleaned up printing the provider info during initialisation
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c

Signed-off-by: Anandhi S Jayakumar <anandhi.s.jayakumar@intel.com>

Fixing warnings
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c
	modified:   ../orte/mca/rml/ofi/rml_ofi_send.c

Signed-off-by: Anandhi S Jayakumar <anandhi.s.jayakumar@intel.com>

minor cleanup
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c
	modified:   ../orte/mca/rml/ofi/rml_ofi_send.c

Signed-off-by: Anandhi S Jayakumar <anandhi.s.jayakumar@intel.com>

more cleanup
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c

Signed-off-by: Anandhi S Jayakumar <anandhi.s.jayakumar@intel.com>

Sending the ethernet address only in the get_contact_info, rest will be sent through modex
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c

Signed-off-by: Anandhi S Jayakumar <anandhi.s.jayakumar@intel.com>

Adding error logging on failures
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c

Signed-off-by: Anandhi S Jayakumar <anandhi.s.jayakumar@intel.com>

Handling the OPAL_MODEX_SEND/RECV generically for all ofi providers.
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c
	modified:   ../orte/mca/rml/ofi/rml_ofi_send.c

Signed-off-by: Anandhi S Jayakumar <anandhi.s.jayakumar@intel.com>

Adding to build ofi for limited people
	new file:   ../orte/mca/rml/ofi/.opal_ignore
	new file:   ../orte/mca/rml/ofi/.opal_unignore

Signed-off-by: Anandhi S Jayakumar <anandhi.s.jayakumar@intel.com>

Removing the error logging for now
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c
2016-10-26 13:11:07 -07:00
Ralph Castain
d031946c46 When mpirun operates in --continuous mode, we won't terminate the job when a remote process dies. In that case, we have to activate both the waitpid _and_ the IOF complete states to ensure we properly mark the proc as dead and perform any required notifications
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-10-25 12:18:14 -07:00
Ralph Castain
227d4d9609 Open the conduits for application procs - we probably can remove all the
RML-related frameworks from MPI applications now, but let's wait a bit
to ensure we have cleaned up all the points where messaging might occur.
2016-10-24 16:53:19 -07:00
Ralph Castain
649301a3a2 Revise the routed framework to be multi-select so it can support the new conduit system. Update all calls to rml.send* to the new syntax. Define an orte_mgmt_conduit for admin and IOF messages, and an orte_coll_conduit for all collective operations (e.g., xcast, modex, and barrier).
Still not completely done as we need a better way of tracking the routed module being used down in the OOB - e.g., when a peer drops connection, we want to remove that route from all conduits that (a) use the OOB and (b) are routed, but we don't want to remove it from an OFI conduit.
2016-10-23 21:52:39 -07:00
Ralph Castain
df8ac7b747 Properly mark a node as down and decrease the number of daemons so any
subsequent grpcomm collectives can correctly operate. Note that only the
direct grpcomm component knows how to deal with down nodes.
2016-10-21 09:53:37 -07:00
Gilles Gouaillardet
1846c2d8ad plm/rsh: use an alternate port if the ORTE_NODE_PORT attribute is set 2016-10-19 16:18:52 +09:00
Ralph Castain
16540c7422 Properly report failure to launch when someone mis-types the name of the application
Fixes #2233
2016-10-18 10:09:30 -07:00
Ralph Castain
7be607582e ORTE applications need to commit any modex send's prior to calling fence 2016-10-18 09:22:56 -07:00
Ralph Castain
57114a09ae Pickup the npernode and npersocket options and include them in the job object 2016-10-17 12:26:21 -07:00
Gilles Gouaillardet
bd1b6fe661 rml/oob: add a missing include file 2016-10-16 10:25:00 +09:00
Gilles Gouaillardet
451b9dc467 ess: tear down pmix (if any) before oob 2016-10-13 14:08:02 +09:00
Ralph Castain
fca1556787 Some compilers apparently complain about this, so modify the typedef statements 2016-10-12 08:44:03 -07:00
Ralph Castain
a2919174d0 Bring the RML modifications across. This is the first step in a revamp of the ORTE messaging subsystem to support fabric-based communications during launch and wireup phases. When completed, the grpcomm and plm frameworks will each have their own "conduit" for communication - each conduit corresponds to a particular RML messaging transport. This can be the active OOB-based component, or a provider from within the RML/OFI component. Messages sent down the conduit will flow across the associated transport.
Multiple conduits can exist at the same time, and can even point to the same base transport. Each conduit can have its own characteristics (e.g., flow control) based on the info keys provided to the "open_conduit" call. For ease during the transition period, the "legacy" RML interfaces remain as wrappers over the new conduit-based APIs using a default conduit opened during orte_init - this default conduit is tied to the OOB framework so that current behaviors are preserved. Once the transition has been completed, a one-time cleanup will be done to update all RML calls to the new APIs and the "legacy" interfaces will be deleted.

While we are at it: Remove oob/usock component to eliminate the TMPDIR length problem - get all working, including oob_stress
2016-10-11 16:01:02 -07:00
Gilles Gouaillardet
c92e9a5406 use the new OPAL_HASH_TABLE_FOREACH convenience macro 2016-10-08 16:58:20 +09:00
Gilles Gouaillardet
0931d09afa ess/singleton: silence a valgrind warning
initialize a pointer and keep valgrind happy about it
2016-09-27 15:22:39 +09:00
Gilles Gouaillardet
f9ebba4668 ess/singleton: only realloc() when required in fork_hnp() 2016-09-23 16:35:59 +09:00
Gilles Gouaillardet
c7bf9a0ec9 ess/singleton: fix read on the pipe to spawn'ed orted
and close the pipe on both ends when it is no longer needed
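
The commit message does not spell out what was wrong with the read. One common way to make a pipe read robust is to loop until the expected number of bytes has arrived, retrying on `EINTR` and treating an early EOF as an error. The sketch below shows that pattern; `read_fully()` is a hypothetical helper, not the function used in ess/singleton.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Read exactly `len` bytes from `fd` into `buf`.
 * Returns 0 on success, -1 on an I/O error or if the writer closed the
 * pipe before all bytes arrived.
 */
static int read_fully(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (EINTR == errno) {
                continue;   /* interrupted by a signal: retry */
            }
            return -1;      /* real I/O error */
        }
        if (0 == n) {
            return -1;      /* EOF before the full message arrived */
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

Once the message has been read, both ends of the pipe can be closed with `close()`.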
2016-09-22 14:21:52 +09:00
Ralph Castain
de7b1494d9 Clean out old cruft from the ORCM project 2016-09-21 00:13:30 -07:00
Gilles Gouaillardet
83399adb3f singleton: "safe" read/write to the pipe between (spawn'ed) orted and singleton 2016-09-20 14:56:58 +09:00
Gilles Gouaillardet
e7ae6975d0 orted: fix spawn in singleton mode
in singleton mode, have the spawn'ed orted invoke orte_pre_condition_transports()
and send the transport key back to the singleton
2016-09-20 14:39:22 +09:00
Ralph Castain
a16b3cc33d Fix some minor complaints - missing "void" in function parameters 2016-09-15 15:18:42 -07:00
Ralph Castain
6f086189e6 Fix trivial typo 2016-09-15 13:10:55 -07:00
Gregory M. Kurtzer
16794cc260 Updates to support Singularity containers v2.2 2016-09-15 09:52:06 -07:00
Gilles Gouaillardet
11ebf3ab23 ess/singleton: when forking hnp, use the PMIX_NAMESPACE sent by the hnp
as the jobid
2016-09-15 13:57:23 +09:00
Gilles Gouaillardet
e84b35217f oob/tcp: plug a memory leak
as reported by Coverity with CID 1196711
2016-09-08 18:50:18 +09:00
Gilles Gouaillardet
b2a2be0e5a odls: fix memory leak plug
This fixes commit open-mpi/ompi@e2c343cdfc.
2016-09-08 10:02:52 +09:00
Artem Polyakov
9eba1b0b75 Merge pull request #2042 from artpol84/pmix_sdirs
Several fixes related to session directories:
2016-09-07 14:15:47 +07:00
Artem Polyakov
a9a7f39773 ess/pmi: fix the comments about MCA/PMIx setting conflict resolution. 2016-09-07 07:47:35 +03:00
Gilles Gouaillardet
e2c343cdfc odls: plug memory leak
as reported by Coverity with CID 710645
2016-09-07 10:08:44 +09:00
Gilles Gouaillardet
c09899f6af plm: plug resource leaks
as reported by Coverity with CIDs 72274 and 1196733
2016-09-07 10:08:44 +09:00
Josh Hursey
f6337f9eae Merge pull request #2047 from jjhursey/topic/mixed-host2
orte: !FQDN implementation to use opal_net_isaddr
2016-09-06 13:08:54 -05:00
Ralph Castain
f85dcaee2a Fixes CID 1369067 and CID 1196684
Fixes CID 1369648

Fixes CID 1372409
2016-09-06 08:43:15 -07:00
Artem Polyakov
74a11d7832 Fix session dir cleanup code. 2016-09-05 07:53:55 +03:00
Artem Polyakov
dc0ab674de Add PMIx key to provide RM with ability to indicate that it will clean up
session directories provided through OPAL_PMIX_TMPDIR,
OPAL_PMIX_NSDIR, OPAL_PMIX_PROCDIR
2016-09-05 07:48:44 +03:00
Artem Polyakov
81195ab724 Several fixes related to session directories:
* enable OMPI to retrieve paths from RM through PMIx
* cleanups related to tempdirs.
2016-09-05 07:48:44 +03:00
Ralph Castain
fb51d65049 Minor change: check for NULL before using the job map to avoid segfault when erroring out prior to creating the map 2016-09-04 07:53:12 -07:00
Joshua Hursey
fe937d1e82 orte: !FQDN implementation to use opal_net_isaddr
* Switch to use opal_net_isaddr() for checking if a name is an IP
   address - as it is a bit cleaner, and uses common functionality.
2016-09-02 13:31:49 -05:00
Ralph Castain
4e0788e9ad Enable PSM to support dynamic processes
Fix comm_spawn to correctly reference the actual parent process that requested the spawn when looking for the parent job object
2016-09-02 10:22:04 -07:00
Ralph Castain
0ea1cff733 Implement notification of completion on comm_spawn'd child jobs. Add a configure flag to enable PMIx 3's shared memory datastore, and set it to disabled by default so that comm_spawn functions again. Will reverse the default once that feature is fully functional 2016-09-01 13:10:10 -07:00
Gilles Gouaillardet
0b8c58298d oob/usock: fix handling of orte_process_name_t *
orte_process_name_t is aligned on 32 bits, so it cannot simply be cast
to an int64_t. Use memcpy() instead

Thanks Paul Hargrove for the report
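
A minimal sketch of the difference, assuming a stand-in struct `demo_name_t` with two 32-bit fields (the real `orte_process_name_t` is not reproduced here) and a hypothetical helper `name_to_key()`:

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for a process name: two 32-bit fields, so only 4-byte aligned. */
typedef struct {
    uint32_t jobid;
    uint32_t vpid;
} demo_name_t;

_Static_assert(sizeof(demo_name_t) == sizeof(uint64_t),
               "demo_name_t must be exactly 64 bits wide");

/*
 * Pack the name into a 64-bit key.
 * Dereferencing (uint64_t *)name would be an unaligned 64-bit load when
 * the struct sits on a 4-byte boundary; memcpy() has no alignment
 * requirement and compilers typically turn it into a plain load where
 * that is safe.
 */
static uint64_t name_to_key(const demo_name_t *name)
{
    uint64_t key;
    memcpy(&key, name, sizeof(key));
    return key;
}
```

Two names with the same fields produce the same key, and the numeric value of the key depends on the host byte order.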
2016-09-01 13:18:02 +09:00
Ralph Castain
c1050bc01e Provide a mechanism for obtaining memory profiles of daemons and application profiles for use in studying our memory footprint. Setting OMPI_MEMPROFILE=N causes mpirun to set a timer for N seconds. When the timer fires, mpirun will query each daemon in the job to report its own memory usage plus the average memory usage of its child processes. The Proportional Set Size (PSS) is used for this purpose. 2016-08-31 09:32:07 -07:00
Ralph Castain
9b991bd1f5 Ensure that the "running" state is correctly updated
It is possible that one or more procs could get thru PMIx_Init, and thus be marked as in state "registered", before all local procs have been started. If that happens, then we would report some of the procs in state "running", and the others in state "registered" - which means that the HNP would miss the "running" stage of the state machine.

Thanks to Jingchao Zhang for his patience in tracking this down on the 2.0 branch
2016-08-30 19:24:39 -07:00
Josh Hursey
b0d8638824 Merge pull request #2015 from jjhursey/topic/mixed-hostnames
orte: Expand use of !orte_keep_fqdn_hostnames MCA parameter
2016-08-29 09:14:54 -05:00
Ralph Castain
2f6e0fec90 Provide the number of nodes in the job 2016-08-26 14:50:41 -07:00
Joshua Hursey
d26dd2c20e orte: Expand the application of !orte_keep_fqdn_hostnames
* Expand the use of the `orte_keep_fqdn_hostnames` MCA parameter when
   it is set to false.
 * If that parameter is set to false (default) then short hostnames
   (e.g., `node01`) will match with the long hostnames (e.g.,
   `node01.mycluster.org`). This allows a user (or resource manager)
    to mix the use of short and long hostnames.
  - Note that this mechanism does _not_ perform a DNS lookup, but
    instead strips off the FQDN by truncating the hostname string at
    the first `.` character (when not an IP address).
     - By default (`false`) the following is true:
       `node01 == node01.mycluster.org == node01.bogus.com`
       since we use `node01` as the hostname.
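
The matching rule described above can be sketched as follows. This is an illustration under stated assumptions: `inet_pton()` stands in for the IP-address check (the project later switched to `opal_net_isaddr()`), and `looks_like_ip()`, `strip_fqdn()` and `same_host()` are hypothetical names.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Assumption: inet_pton() is used here as the "is this an IP address" test. */
static bool looks_like_ip(const char *name)
{
    unsigned char buf[sizeof(struct in6_addr)];
    return 1 == inet_pton(AF_INET, name, buf) ||
           1 == inet_pton(AF_INET6, name, buf);
}

/* Truncate `name` in place at the first '.', unless it is an IP address. */
static void strip_fqdn(char *name)
{
    if (looks_like_ip(name)) {
        return;
    }
    char *dot = strchr(name, '.');
    if (NULL != dot) {
        *dot = '\0';
    }
}

/* Compare two hostnames as in the default (keep FQDN = false) case. */
static bool same_host(const char *a, const char *b)
{
    char sa[256], sb[256];
    snprintf(sa, sizeof(sa), "%s", a);
    snprintf(sb, sizeof(sb), "%s", b);
    strip_fqdn(sa);
    strip_fqdn(sb);
    return 0 == strcmp(sa, sb);
}
```

No DNS lookup is involved, which is why `node01.mycluster.org` and `node01.bogus.com` compare equal, while two different IP addresses such as `10.0.0.1` and `10.0.0.2` do not.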
2016-08-26 16:09:04 -05:00
Artem Polyakov
55ac3b0be3 orte/schizo: fix binding detection in slurm component
in SLURM 16.05 the SLURM_CPU_BIND_TYPE is equal to "mask_cpu:"
instead of "mask_cpu". Account for that.
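
One way to accept both spellings is to match the `mask_cpu` prefix and allow an optional trailing colon. The helper below is a hypothetical sketch of that check, not the code in the schizo/slurm component.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `bind_type` (for example the value of
 * SLURM_CPU_BIND_TYPE) requests mask_cpu binding, with or without the
 * trailing ':' that SLURM 16.05 appends.
 */
static bool is_mask_cpu(const char *bind_type)
{
    static const char prefix[] = "mask_cpu";
    const size_t plen = sizeof(prefix) - 1;

    if (NULL == bind_type) {
        return false;
    }
    if (0 != strncmp(bind_type, prefix, plen)) {
        return false;
    }
    /* Accept exactly "mask_cpu" or "mask_cpu:". */
    const char *rest = bind_type + plen;
    return '\0' == rest[0] || 0 == strcmp(rest, ":");
}
```

A caller would typically pass `getenv("SLURM_CPU_BIND_TYPE")`, which may be `NULL` outside of a SLURM job.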
2016-08-26 09:55:52 +03:00
rhc54
19b0f4db9f Merge pull request #1995 from rhc54/topic/pe-per-rank
Change the behavior of cpus-per-rank.
2016-08-25 14:38:12 -05:00
Ralph Castain
440eae90ec Correct the binding algorithm to decouple it from oversubscribe.
Oversubscribe stipulates that we allow more procs on the node than assigned slots - it has nothing to do with the number of available pe's. Let overload directives handle the pe situation.
2016-08-24 21:17:22 -07:00
Gilles Gouaillardet
93e73841f9 ess/singleton: push all PMIX_* environment variables, regardless how many there are 2016-08-23 09:46:55 +09:00
Gilles Gouaillardet
a1e8e58a8a ess/singleton: expects 4 PMIX_* environment variables or more 2016-08-23 09:34:03 +09:00
Ralph Castain
7de4d6922b Change the behavior of cpus-per-rank. We previously counted each cpu against the #slots. However, IBM has pointed out that "slot" is equated to the number of processes allowed to run on each node, and not the number of cpus on the node. This has been a continuing source of confusion, so make the distinction a "hard" one.
Each process occupies a "slot". We automatically set #slots = #cpus if nothing else is told to us. If you want to run more procs than slots, you must tell us to allow oversubscription.

A process can utilize multiple pe's if that option is given. If you try to bind more than one proc to a given pe, then we will error out unless you tell us to allow overloading.
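
The two rules can be sketched as separate checks: slots limit the number of processes (oversubscription), while pe's limit how many cpus those processes may claim (overloading). The types and the `check_mapping()` function below are hypothetical and only illustrate the distinction; they are not the ORTE mapper.

```c
#include <stdbool.h>

typedef enum {
    MAP_OK,
    MAP_ERR_OVERSUBSCRIBED,   /* more procs than slots */
    MAP_ERR_OVERLOADED        /* more than one proc per pe */
} map_status_t;

/* Hypothetical node description. */
typedef struct {
    int cpus;    /* processing elements (pe's) on the node */
    int slots;   /* <= 0 means "not specified" */
} demo_node_t;

static map_status_t check_mapping(demo_node_t node, int nprocs,
                                  int pes_per_proc,
                                  bool allow_oversubscribe,
                                  bool allow_overload)
{
    /* Default: #slots = #cpus when nothing else was given. */
    int slots = node.slots > 0 ? node.slots : node.cpus;

    /* Each process occupies one slot, however many pe's it uses. */
    if (nprocs > slots && !allow_oversubscribe) {
        return MAP_ERR_OVERSUBSCRIBED;
    }
    /* If the procs need more pe's than exist, some pe must be shared. */
    if (nprocs * pes_per_proc > node.cpus && !allow_overload) {
        return MAP_ERR_OVERLOADED;
    }
    return MAP_OK;
}
```

For a node with 8 cpus and 4 slots, 4 procs using 2 pe's each pass both checks, because the extra pe's no longer count against the slots.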
2016-08-22 15:54:41 -07:00
Jeff Squyres
71ec5cfb43 rsh: robustify the check for plm_rsh_agent default value
Don't strcmp against the default value -- the default value may change
over time.  Instead, check to see if the MCA var source is not
DEFAULT.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-08-16 06:58:20 -05:00
rhc54
d7cd802426 Merge pull request #1971 from rhc54/topic/sesdir
Update the session dir structure. Restore the creation of a top-level…
2016-08-16 03:14:08 -05:00
Ralph Castain
ae2af61ee3 Update the session dir structure. Restore the creation of a top-level dir based on userid so that everything is contained under the user's top-level dir. Make the next level down (the "job family" level) be either the pid (indicated by a name of "pid.N") or the job family if not launched by mpirun. This allows for proper rendezvous by direct-launched procs. 2016-08-15 22:46:46 -05:00
Ralph Castain
9f43db7303 Further cleanup getpwuid usage - try it first (unless completely disabled), and then silently failover to try other methods. 2016-08-15 07:51:36 -07:00
Ralph Castain
be8424b691 Provide backward compatible keys so that the non-PMIx components in the opal/pmix framework don't have to adjust as we continue to work on finalizing the PMIx reference scheme. Activate and utilize the new PMIx show_help capability to provide more meaningful error output when the server cannot start.
Add a contrib script to cleanup permissions incorrectly modified due to things like smb mounts
2016-08-13 12:13:04 -07:00
Ralph Castain
08a0644df5 Fix shared memory rendezvous 2016-08-13 08:14:50 -07:00
rhc54
ddde154d28 Merge pull request #1962 from rhc54/topic/notify
Ensure we properly convert pmix status to ORTE state before activatin…
2016-08-13 06:59:50 -07:00
Ralph Castain
48d35a9627 Ensure we properly convert pmix status to ORTE state before activating an error state upon notification. Cleanup some conversion issues on notification info. Add a new orte_notify.c test program 2016-08-12 21:14:29 -07:00
rhc54
9eed451916 Merge pull request #1960 from rhc54/topic/rsh
Restore the rsh template creation code
2016-08-12 13:38:43 -07:00
rhc54
1ef3c86d44 Merge pull request #1931 from hjelmn/ess_fix
ess/base: set up nidmap after pmix
2016-08-12 13:10:30 -07:00
Ralph Castain
5717b75b45 Restore the rsh template creation code 2016-08-12 12:43:40 -07:00
Ralph Castain
1c44543854 If the ssh agent hasn't been given, then check for qrsh and friends 2016-08-12 07:46:39 -07:00
Artem Polyakov
1351a7065c ess/pmi: minor code readability cleanup.
Split process name variable "name" to
- "wildcard_rank" for the cases where wildcard is used.
- "pname" for the case where reference to particular process is needed.
2016-08-06 15:45:19 +06:00
Nathan Hjelm
3c23502dfe ess/base: set up nidmap after pmix
This fixes a SEGV when the nidmap code attempts to use
opal_pmix.store_local before pmix is set up.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-08-02 09:50:00 -06:00
Ralph Castain
71de03fc67 Cleanup the new naming requirements to ensure that info is correctly retrieved
Cleanup permissions

Restore singleton operations
2016-07-21 09:46:03 -07:00
Ralph Castain
01a653d50a Remove a debug print in comm_cid.c. Update PMIx2 to include the revised PMIx_Get logic for higher performance by reducing the number of hash table lookups. Fix a bug where requests for data from a proc in another nspace could hang, or result in "not found".
Remove stale file reference

Restore autogen pass thru pmix

Remove generated file
2016-07-20 00:58:19 -07:00
rhc54
2414244171 Merge pull request #1872 from rhc54/topic/continuous
Add support for continuously operating applications
2016-07-13 15:29:31 -07:00
Ralph Castain
20a91c2baf Add a new --continuous flag to mpirun that directs ORTE to let a job continue running as app procs terminate. Don't attempt to restart them. Add event notification of abnormally terminating procs, and demonstrate that in the mpi_spin test program.
Cleanup debug message
2016-07-13 15:28:33 -07:00
Ralph Castain
ddd0d05de3 Fix a bug in the handling of nper<foo> when -host or -hostfile was given. Correctly mark slots as "given" when we auto-assign them. Ensure we don't set the number of procs when using nper<foo> so the PPR mapper can correctly assign them. 2016-07-12 09:27:02 -07:00
Ralph Castain
ee56d9dc1a Shorten the session directory name as some OS's are now providing unusually long temp directory names, causing us to overflow the sockaddr field 2016-07-05 14:59:50 -07:00
Ralph Castain
5d330d5220 Enable the PMIx event notification capability and use that for all error notifications, including debugger release. This capability requires use of PMIx 2.0 or above as the features are not available with earlier PMIx releases. When OMPI master is built against an earlier external version, it will fallback to the prior behavior - i.e., debugger will be released via RML and all notifications will go strictly to the default error handler.
Add PMIx 2.0

Remove PMIx 1.1.4

Cleanup copying of component

Add missing file

Touchup a typo in the Makefile.am

Update the pmix ext114 component

Minor cleanups and resync to master

Update to latest PMIx 2.x

Update to the PMIx event notification branch latest changes
2016-06-14 13:08:41 -07:00
Ralph Castain
a6e6c37484 Remove stale map-reduce support 2016-06-12 07:41:57 -07:00
Ralph Castain
dd0f843843 Fix rare hangs observed on OS-X by properly thread-shifting upcalls from the PMIx server into ORTE 2016-06-05 21:39:44 -07:00
Ralph Castain
0ba9572f9f Cleanup the forced termination a bit by restoring the delay before issuing the sigkill, and eliminating the large time loss spent checking if the proc died. The latter is responsible for a large number of test timeouts in MTT
Update alps component
2016-06-02 17:48:21 -07:00
Gilles Gouaillardet
5f565dfec3 configury: clean the flex generated .c files 2016-06-01 11:13:31 +09:00
Ralph Castain
3913595e10 Enable simulation of large-scale clusters by allowing multiple daemons/node. Specifying the ras_base_multiplier parameter to be greater than 1 will cause ORTE to replicate each allocated node by that factor. A daemon will be spawned for each replica, thus letting ORTE function as if it were on a much larger cluster.
Note that this cannot be used for MPI performance testing. It is really only useful for ORTE scaling tests. It also only works with the rsh/ssh launcher.
2016-05-29 18:56:18 -07:00
Ralph Castain
ebe159acef Add a timeout cmd line option and an option to report state info upon timeout to assist with debugging Jenkins tests
If requested, obtain stacktraces for each application process and report it to stderr upon timeout

stack traces: minor improvements

- Also include the hostname and PID of each process for which
  we're sending the stack traces (vs. just including the ORTE process
  name)
- Send a specific error message if we couldn't find "gstack" in the
  $PATH (e.g., on OS X)
- Send a specific error message if gstack fails to run
- Print a message that obtaining the stack traces may take a few
  seconds so that users don't wonder what's happening

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>

help-orterun.txt: minor tweaks

Trivial update: show "--timeout" (instead of "-timeout") in the help
message, just to encourage the use of double-dash options.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>

trivial: stacktrace -> stack trace

Trivial wordsmithing.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-05-28 08:36:25 -07:00
Jeff Squyres
dd9a819a1c odls_default: do not opal_output() while creating a process!
It is verboten to use opal_output() after the fork() but before the
exec()!  It results in all manner of undefined behavior.  For example,
on some OS X systems, if you run a trivial "hello world" MPI program
with a high level of ODLS verbosity:

```sh
$ mpirun -np 3 --mca odls_base_verbose 100 ./hello_c
```

You will see a bunch of output from the mpirun ODLS base, but then it
*may* hang in odls_default_module.c:do_child() -- after the fork() but
before the exec() -- while trying to opal_output() some debugging
statements.

The solution is to remove these extraneous opal_output() statements.
Indeed, the ODLS base is already outputting the same information that
these opal_output() statements are trying to emit, anyway.
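
The general rule behind this is that, in a multi-threaded parent, the child may only call async-signal-safe functions between fork() and exec(). A minimal sketch of a child path that respects this is shown below; `spawn_and_wait()` is a hypothetical helper and is not the code in odls_default_module.c.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork, exec argv[0] (an absolute path) and wait for it.
 * Returns the child's exit status, or -1 on failure.
 */
static int spawn_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (0 == pid) {
        /* Child: only async-signal-safe calls from here on. No logging
         * framework, no stdio, no malloc() before the exec. */
        execv(argv[0], argv);

        /* exec failed: write() and _exit() are async-signal-safe. */
        static const char msg[] = "exec failed\n";
        if (write(STDERR_FILENO, msg, sizeof(msg) - 1) < 0) {
            /* nothing safe left to do */
        }
        _exit(127);
    }

    /* Parent: any verbose output belongs here, outside the child path. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Any debugging output about the launch is better emitted by the parent, before fork() or after waitpid().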

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-05-24 21:28:57 -04:00
Ralph Castain
30aaf785a8 Fix the dist mapper option 2016-05-23 23:20:33 -07:00
George Bosilca
50b37758d4 Don't overwrite the function argument.
In an MPMD setup the app in the jdata can be NULL, so make sure we
don't leave the main argument in an inconsistent state.
2016-05-19 10:35:23 -04:00
Ralph Castain
7e5ef6a240 Fix the env_list support - the MCA param was being set way too early, so provide a "backdoor" way of providing the value 2016-05-06 15:38:39 -07:00
Ralph Castain
58dd41facf Repair the processing of cmd line options that mapped to MCA params. This was responsible for breaking things like map-by <foo>.
Remove debug, let orterun send terminate cmd to DVM

Recover the DVM support
2016-05-06 13:14:03 -07:00
rhc54
ff8518853e Merge pull request #1604 from rhc54/topic/psm2
Improve the transport key print statement to ensure that we don't get…
2016-05-03 13:43:10 -07:00
Jeff Squyres
265e5b9795 Merge pull request #1552 from kmroz/wip-hostname-len-cleanup-1
ompi/opal/orte/oshmem/test: max hostname length cleanup
2016-05-02 09:44:18 -04:00
rhc54
2fa8b6c6ac Merge pull request #1525 from rhc54/topic/schizo
Extend the schizo framework
2016-05-01 15:09:08 -07:00
Ralph Castain
6ac7929bd0 Extend the schizo framework to allow definition of CLI options by environment. Refactor orterun to mesh with the orted_submit code, thus improving code reuse. Eliminate the orte-submit tool as orterun can now meet that need.
Cleanups per @jjhursey review
2016-05-01 11:30:25 -07:00
Ralph Castain
0f05893952 Ensure consistency between max_procs and univ_size values - since orte wants max_procs, have the proc get that value instead of univ_size
Make the singleton module consistent as well
2016-05-01 11:13:33 -07:00
Ralph Castain
29bc24bdd5 Improve the transport key print statement to ensure that we don't get zero fields as this can be a problem for PSM 2016-04-28 20:11:12 -07:00
Ralph Castain
e6ad1ad621 Up-port of change for 2.x: if user directs oversubscribe, then do not bind as we will otherwise overload resources 2016-04-28 13:21:10 -07:00
Ralph Castain
75dc4c305a Correctly set the #procs in the job to "job_size", and the max_procs to "univ_size" 2016-04-27 12:00:19 -07:00
Gilles Gouaillardet
6bf57c799f orte/rml: ORTE_RML_SEND_COMPLETE handles messages with both NULL iov and cbfunc.buffer 2016-04-26 09:19:31 +09:00
Karol Mroz
5c11bdb251 orte: fixup hostname max length usage
Also removes orte specific max hostname value.

Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-04-25 07:08:23 +02:00
Joshua Hursey
29b49351af ras/lsf: Fix affinity for MPMD jobs running under LSF 2016-04-22 11:18:34 -05:00
Jeff Squyres
68c1a5eb6c Merge pull request #1567 from jsquyres/pr/fix-ompi-to-opal-name-conversion
m4: rename OMPI_SUMMARY_* macros to OPAL_SUMMARY_*
2016-04-20 13:10:06 -04:00
Jeff Squyres
6800ef9ec0 m4: rename OMPI_SUMMARY_* macros to OPAL_SUMMARY_*
These macros should really be named OPAL_SUMMARY_*; they're used in
all projects, and therefore should be in the lowest layer project (OPAL).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-04-20 08:40:00 -07:00
Ralph Castain
449ec41532 Roll to PMIx 1.1.4rc1 and remove the PMIx 1.2.0 directory as the community has decided to not do that release version. This incorporates a number of bug fixes that have been identified and repaired in the PMIx and OMPI code bases. Also includes several minor corrections to the PMIx code so it now supports run-thru without hanging on collectives involving a process that exits 2016-04-15 10:11:11 -07:00
Ralph Castain
1fa236b26c Ensure that we exit with a non-zero status when oversubscribe fails 2016-04-14 05:51:10 -07:00
Ralph Castain
437f5b4289 Fix map-by node and do-not-launch 2016-04-13 09:21:19 -07:00
Ralph Castain
2432daf065 Some minor cleanups of a memory leak and error output 2016-04-08 07:46:18 -07:00
Rainer Keller
52080a5736 As per the pull request to pmix/master:
https://github.com/pmix/master/pull/71

Have OMPI's current version of pmix120 fail cleanly when the
sun_path is too long (longer than 108 chars, or 103 chars on OSX),
and have OMPI return proper error messages with hints on how to
fix the problem.
2016-04-07 22:12:53 +02:00
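
A minimal sketch of the kind of length check this commit describes, assuming a hypothetical helper name `fits_in_sun_path` (the actual pmix120 change is not shown in this log). It compares the path length against the platform's own `sun_path` array size rather than hard-coding a limit.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: returns 1 if `path` (plus its terminating NUL)
 * fits in sockaddr_un.sun_path on this platform, otherwise 0. */
static int fits_in_sun_path(const char *path)
{
    struct sockaddr_un addr;
    return strlen(path) < sizeof(addr.sun_path);
}
```

A caller would run this check before bind() or connect() and report the offending path, for example suggesting a shorter temporary directory, instead of silently truncating it.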
rhc54
a95de6e8ef Merge pull request #1353 from rhc54/topic/host
Per the discussion on the telecon, change the -host behavior yet again
2016-04-04 10:30:36 -07:00
Gilles Gouaillardet
d757fbba5d oob/usock: drop message to be sent in process_send() 2016-04-04 16:04:54 +09:00
Gilles Gouaillardet
170734182b oob/usock: mca_oob_usock_peer_close() sets peer->sd = -1 after close()
so usock_peer_create_socket knows it must re-create the socket
/* assuming it is ever supposed to occur */
also fix a typo (peer->sd >= 0) in usock_peer_create_socket
2016-04-04 16:02:05 +09:00
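
The descriptor convention this commit relies on can be sketched as follows. `struct peer`, `peer_close` and `peer_create_socket` are hypothetical stand-ins, not the real oob/usock types or functions; the only point is that -1 marks "no socket" and `sd >= 0` marks "socket already open".

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in for the real peer structure. */
struct peer {
    int sd;   /* socket descriptor, or -1 when no socket is open */
};

/* Close the socket and mark the peer so it can be re-created later. */
static void peer_close(struct peer *p)
{
    if (p->sd >= 0) {
        close(p->sd);
        p->sd = -1;
    }
}

/* Create the socket only if the peer does not already have one.
 * Returns 0 on success, -1 if socket() fails. */
static int peer_create_socket(struct peer *p)
{
    if (p->sd >= 0) {
        return 0;     /* socket already exists; nothing to do */
    }
    p->sd = socket(AF_UNIX, SOCK_STREAM, 0);
    return (p->sd >= 0) ? 0 : -1;
}
```

Without the reset to -1 in the close path, the create path would see a stale non-negative descriptor and skip re-creation.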
Ralph Castain
503e1274a9 Per the discussion on the telecon, change the -host behavior so we only run one instance if no slots were provided and the user didn't specify #procs to run. However, if no slots are given and the user does specify #procs, then let the number of slots default to the #found processing elements
Ensure the returned exit status is non-zero if we fail to map

If no -np is given, but either -host and/or -hostfile was given, then error out with a message telling the user that this combination is not supported.

If -np is given, and -host is given with only one instance of each host, then default the #slots to the detected #pe's and enforce oversubscription rules.

If -np is given, and -host is given with more than one instance of a given host, then set the #slots for that host to the number of times it was given and enforce oversubscription rules. Alternatively, the #slots can be specified via "-host foo:N". I therefore believe that row #7 on Jeff's spreadsheet is incorrect.

With that one correction, this now passes all the given use-cases on that spreadsheet.

Make things behave under unmanaged allocations more like their managed cousins - if the #slots is given, then no-np shall fill things up.

Fixes #1344
2016-03-29 11:21:57 -07:00
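
The slot rules for the case where -np is given can be sketched as a small function. This is an illustration of the rules as worded in the commit message, not the ORTE mapper code: `slots_for_host` and its parameters are hypothetical, and treating an explicit "host:N" entry as overriding repeated entries is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: derive the slot count for `host` from a
 * comma-separated -host style list such as "foo,foo,bar:3".
 *   host:N entry        -> N slots
 *   host listed once    -> detected_pes slots
 *   host listed k times -> k slots
 *   host not listed     -> 0 */
static int slots_for_host(const char *host_list, const char *host,
                          int detected_pes)
{
    size_t hlen = strlen(host);
    int occurrences = 0;
    const char *p = host_list;

    while (*p != '\0') {
        size_t len = strcspn(p, ",");      /* length of this entry */
        if (len >= hlen && strncmp(p, host, hlen) == 0) {
            if (len == hlen) {
                occurrences++;             /* plain "host" entry */
            } else if (p[hlen] == ':') {
                return atoi(p + hlen + 1); /* explicit "host:N" entry */
            }
        }
        p += len;
        if (*p == ',') {
            p++;                           /* skip the separator */
        }
    }
    if (occurrences == 0) {
        return 0;
    }
    return (occurrences == 1) ? detected_pes : occurrences;
}
```

Oversubscription checks would then compare the requested number of processes against the sum of these per-host slot counts.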
Ralph Castain
bd18d9c9d5 Ensure the compiler knows that a critical variable is volatile 2016-03-29 09:18:25 -07:00
Howard Pritchard
e7433fcb44 Merge pull request #1486 from hppritcha/topic/fix_wlm_detect_code
plm/alps: fix usage of cray wlm_detect methods
2016-03-26 13:22:50 -06:00
Ralph Castain
0e1350f5b7 Add missing header files 2016-03-25 09:06:51 -07:00
Ralph Castain
a3fea58d1c Minor cleanups to prior PR commit 2016-03-24 15:55:14 -07:00
rhc54
6756e19aa2 Merge pull request #1457 from anandhis/master
rml changes
2016-03-24 15:17:29 -07:00
rhc54
ba8c8700aa Merge pull request #1493 from rhc54/topic/sing
Update singularity support to track changes in upstream Singularity code
2016-03-24 15:16:38 -07:00
Ralph Castain
8c14df2328 Revert "Modify singularity support per patch from Greg Kurtzer"
This reverts commit open-mpi/ompi@f7257a8310.

Ensure that we properly cleanup the session directory tree. Prior code had issues with symlinks, especially if the file that the link points to was already removed as we traverse the tree. Also found that the dirent checks for directory type weren't fully portable, and so fall back to the stat-based approach which is known to be portable.

Fix singularity singletons by detecting we are in a container and properly setting the pmix selection to pick the isolated component. Remove a stale restriction blocking use of the sm btl
2016-03-24 11:27:18 -07:00
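
A minimal sketch of a stat-based directory test of the kind this commit mentions, assuming a hypothetical helper name `is_directory`; the real session directory cleanup code is not shown in this log. The sketch avoids the dirent type field, which not every platform or filesystem fills in, and treats a failed stat() (for example on a symlink whose target is already gone) as "not a directory".

```c
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if `path` names a directory, 0 otherwise.
 * A stat() failure, e.g. a dangling symlink, counts as "not a directory"
 * so the caller can simply unlink the entry. */
static int is_directory(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0) {
        return 0;
    }
    return S_ISDIR(st.st_mode) ? 1 : 0;
}
```

Note that stat() follows symlinks; a tree walker that must never descend through a link to a directory would need lstat() instead, which is a design choice this sketch leaves open.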
Ralph Castain
378d9cbb5e Extend the abort on non zero status flag to apply to processes which die as the result of signals. 2016-03-24 08:33:55 -07:00
Ralph Castain
cdd3dc99ca Correct the binding for the --map-by node case - we should still use our default binding algorithms 2016-03-23 09:55:24 -07:00
Ralph Castain
6e6bbfda91 Very minor typo 2016-03-23 08:31:47 -07:00
Howard Pritchard
69200e6229 plm/alps: fix usage of cray wlm_detect methods
It turns out there are some cases where the Cray
wlm_detect_get_active method may return NULL, in which
case falling back to the wlm_detect_get_default method
is suggested.  Make use of the fallback to
avoid segfaults under some circumstances in the
ALPS plm selection method.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2016-03-22 11:40:56 -07:00
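
The fallback described in this commit can be sketched without the Cray headers by passing the two queries in as function pointers. Everything here is a stand-in: `detect_wlm`, `wlm_query_fn` and the stub functions are hypothetical, and the real wlm_detect_get_active and wlm_detect_get_default signatures are not reproduced.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical signature standing in for the two Cray query functions. */
typedef const char *(*wlm_query_fn)(void);

/* Ask for the active workload manager; if that yields NULL, fall back
 * to the default one. The result may still be NULL if both fail. */
static const char *detect_wlm(wlm_query_fn get_active, wlm_query_fn get_default)
{
    const char *wlm = get_active();
    if (wlm == NULL) {
        wlm = get_default();
    }
    return wlm;
}

/* Example stubs used only to exercise the sketch. */
static const char *stub_active_none(void)   { return NULL; }
static const char *stub_active_alps(void)   { return "ALPS"; }
static const char *stub_default_slurm(void) { return "SLURM"; }
```

A caller should still check the final result for NULL before passing it to strcmp() or similar, since that unchecked use is the kind of segfault the commit is avoiding.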
Ralph Castain
c146c4969b Revert part of open-mpi/ompi@c1bbbb5e2f to restore the usock component, thus fixing show_help aggregation.
Fixes #1467

Restore debugger attach operations

Fixes #1225
2016-03-18 21:49:04 -07:00
Ralph Castain
2970becd6b Revert "Merge pull request #1451 from ggouaillardet/topic/orte_fork_wrapper_fullname"
This reverts commit efafd62d38, reversing
changes made to a93b849f13.
2016-03-18 07:18:36 -07:00
Gilles Gouaillardet
589924c4aa odls/base: use the full app name when using an orte fork agent 2016-03-14 11:18:21 +09:00
Anandhi S Jayakumar
a31292abc7 fixes to ud for removing qos channel 2016-03-10 18:03:17 -08:00
Ralph Castain
a4c8e8c28a Cleanup the proposed change:
* qos framework is moving to the scon layer and is no longer required in ORTE

* remove the rml/ftrm component as we now have multiple active components, and so the wrapper needs to be rethought

* no need for separating the "base" from "API" module definition. The two are identical

* move the "stub" functions into their own file for cleanliness

* general cleanup to meet coding standards

* cleanup some logic in the stubs
2016-03-10 13:14:17 -08:00
Jeff Squyres
48c650c47a configury: minor updates to config summary output 2016-03-10 13:02:52 -08:00
Anandhi S Jayakumar
0188c3cf81 Adding commit for multiple plugin loading support in RML 2016-03-09 18:13:48 -08:00
Ralph Castain
f7257a8310 Modify singularity support per patch from Greg Kurtzer 2016-03-09 07:52:11 -08:00
Ralph Castain
f3ae30ff39 Fix singletons yet again... 2016-03-08 10:33:35 -08:00
Ralph Castain
d72c1c72ff Do not push child processes into separate process groups so that any host RM can still "see" them, and ensure that any signal sent to the orted's themselves will be provided to all child processes. Forward all signals from mpirun to the child processes, removing the old MCA parameter required to turn that behavior "on". 2016-03-06 17:55:09 -08:00
Ralph Castain
4d0cc27eb7 Update the singularity support to match that of the latest singularity master. Remove the restriction on shared memory components by instructing singularity to not isolate the PID space. Add a new schizo API to allow setting up the original app_context. Ensure the container is installed prior to execution. 2016-03-05 21:47:42 -08:00
Ralph Castain
ce0a05d7d1 Minor cleanup - Singularity now has an internal check for installed, so we no longer need to do so. 2016-03-04 19:07:53 -08:00
Gilles Gouaillardet
80bdbfd9e7 add missing include file 2016-03-03 13:46:28 +09:00
Ralph Castain
4a55fba414 Fix registration of error handlers thru the pmix120 component. A thread-shift operation was hanging on the sync_event_base, which made it dependent on someone calling opal_progress. Unfortunately, a process in "sleep" or spinning outside the MPI library won't do that, and so we never complete errhandler registration. 2016-03-02 15:01:01 -08:00
Ralph Castain
1b81d90eaa Minor cleanups required for orte-dvm operation 2016-03-01 18:12:53 -08:00
Ralph Castain
c9f7bb6751 Add the include file to all the schizo components 2016-03-01 13:18:23 -08:00
Ralph Castain
625083fe18 Add include file 2016-03-01 13:04:20 -08:00