Ralph Castain
50ca9fb66b
Merge pull request #2893 from rhc54/topic/sim
...
Cleanup the ras simulator capability, and the relay route thru grpcomm
2017-02-01 16:17:40 -08:00
Ralph Castain
230d15f0d9
Cleanup the ras simulator capability, and the relay route thru grpcomm
...
direct. Don't resend wireup info if nothing has changed
Fix release of buffer
Correct the unpacking order
Fix the DVM - now minimizes data transfer to it
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-02-01 15:01:58 -08:00
Ralph Castain
8bf3ac828c
Correct the path to the ORTE data dir - allows master to be built with --no-ompi
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-02-01 07:30:18 -07:00
Howard Pritchard
e62fca896f
Merge pull request #2889 from hppritcha/topic/fix_ess_alps_makefie
...
ess/alps: fix problem in makefile
2017-02-01 05:46:51 -05:00
Howard Pritchard
db4039f565
ess/alps: fix problem in makefile
...
./autogen.pl --no-ompi doesn't work without this
fix when alps can be configured.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-01-31 21:56:16 -06:00
Josh Hursey
31faf0a950
Merge pull request #2861 from jjhursey/topic/ibm/master/orted-timeout-improv
...
orterun: Add parameter to control when we give up on stack traces
2017-01-31 10:25:57 -06:00
Ralph Castain
b59ae14a2a
Fix static port and partial allocation operations
...
Fix static port wireup by recording the TCP port mpirun is using and correctly passing the regex of hosts to the daemons. Do a better job of closing sockets on failed connection attempts. Correctly identify the remote host in the associated error message.
Fix partial allocation operations by not attempting to set #slots on nodes that were not used, and thus don't have a daemon or topology assigned to them
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-28 10:09:44 -08:00
Ralph Castain
c803af5d3d
Minor change to allow qrsh to tree spawn, if supported
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-27 16:34:08 -08:00
Ralph Castain
7c795f4416
If the HNP is going to request topology info, it cannot do so via a routed OOB message as the intervening daemons may not be ready. So disable routing until the VM is ready, and have daemons start routing as they receive the xcast launch msg (which includes the data they need to talk to their peers).
...
Do a little optimization and minimize recomputation of the routing plan.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-27 15:37:16 -08:00
Ralph Castain
d672fad849
Repair rsh/ssh tree spawn
...
Repair rsh/ssh tree spawn by unpacking and updating the nidmap in remote_spawn.
Add more specific error messages so the cause of a messaging problem is a little clearer. Remove some stale code. Ensure we stop trying to send a message after a few times.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-27 11:35:00 -08:00
Joshua Hursey
3c47432e3d
orterun: Add parameter to control when we give up on stack traces
...
* MCA option to control how long we wait for stack traces:
  - orte_timeout_for_stack_trace INTEGER
    Default: 30
    Setting to <= 0 will cause it to wait forever
* Useful when gathering stack traces from large jobs which might take
  a long time.
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-01-27 09:16:35 -06:00
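The timeout semantics described in this commit (default of 30, with any value <= 0 meaning "wait forever") can be illustrated with a small C sketch. The helper name `stack_trace_wait_expired` and the macro are hypothetical and are not taken from the ORTE source; only the default value and the <= 0 rule come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Default value quoted in the commit message above. */
#define STACK_TRACE_TIMEOUT_DEFAULT 30

/*
 * Hypothetical helper (not the ORTE implementation): report whether we
 * should give up waiting for stack traces after `elapsed_sec` seconds.
 * A timeout <= 0 means "wait forever", so it never expires.
 */
bool stack_trace_wait_expired(int timeout_sec, int elapsed_sec)
{
    assert(elapsed_sec >= 0);
    if (timeout_sec <= 0) {
        return false;
    }
    return elapsed_sec >= timeout_sec;
}
```

Like other MCA parameters, the value would normally be supplied at launch time, for example through mpirun's `--mca <name> <value>` option.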
Josh Hursey
2e64bf42fb
Merge pull request #2810 from jjhursey/fix/ibm/stdiag-to-stdout
...
Extend options for stddiag routing
2017-01-26 14:29:16 -06:00
Nathan Hjelm
fe1c6bd881
Merge pull request #2840 from hjelmn/event_fix
...
verbs: remove extra event user increment/decrement operation
2017-01-26 07:30:24 -08:00
Ralph Castain
399de0738e
Cleanup launch
...
Given that we only set OOB contact info from inside of events, or before we begin threaded operations (e.g., in the ess), allow set_contact_info to directly update the oob/base framework globals.
Correct the nidmap regex decompression routine.
Ensure that rank=1 daemon always sends back its topology as this is the most common use-case.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-25 22:06:09 -08:00
Nathan Hjelm
9f28c0af39
verbs: remove extra event user increment/decrement operation
...
Since the oob and connections systems do not work the same way they
did in older versions of Open MPI these operations are no longer
necessary. At best they do nothing and at worst they hurt performance
by making us enter the event library more often in opal_progress().
Fixes #2839
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-01-25 18:37:06 -07:00
Ralph Castain
2f4e87eae9
Have rank=1 daemon always send its topology back as this is the most common use-case
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-25 09:33:11 -08:00
Jeff Squyres
230bbc597d
plm base: make sure to assign "node" early enough
...
Make sure to assign "node" before using it in ORTE_FLAG_SET.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-25 08:02:59 -08:00
Ralph Castain
184ccc8e91
Cleanup some code so it is clear that it is executing in an event. Ensure that peer event base is properly set on incoming connections
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-25 06:55:11 -08:00
Gilles Gouaillardet
ef10d3fd7b
orte: add missing include file
...
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-25 16:15:20 +09:00
Joshua Hursey
0e9a06d2c3
orte/iof: Add app stderr to stdout redirection at source
...
* Add an MCA parameter to combine stdout and stderr at the source
  - `iof_base_redirect_app_stderr_to_stdout`
* Aids in user debugging when using libraries that mix stderr with stdout
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-01-24 16:23:48 -06:00
Joshua Hursey
dcd9801f7c
orte/iof: Add orte_map_stddiag_to_stdout option
...
* Similar to `orte_map_stddiag_to_stderr` except it redirects `stddiag`
  to `stdout` instead of `stderr`.
* Add protection so that the user cannot supply both:
  - `orte_map_stddiag_to_stderr`
  - `orte_map_stddiag_to_stdout`
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-01-24 16:22:59 -06:00
Ralph Castain
ef86707fbe
Deprecate the --slot-list parameter in favor of --cpu-list. Remove the --cpu-set param (mark it as deprecated) and use --cpu-list instead as it was confusing having the two params. The --cpu-list param defines the cpus to be used by procs of this job, and the binding policy will be overlaid on top of it.
...
Note: since the discovered cpus are filtered against this list, #slots will be set to the #cpus in the list if no slot values are given in a -host or -hostname specification.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-24 13:33:22 -08:00
Ralph Castain
0bfdc0057a
Extend the -host:N syntax to accept "*" or "auto" to indicate "auto-detect the #cpus and set #slots to that value"
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-24 10:21:01 -08:00
Ralph Castain
d3907dec98
Make master continue the -host behavior of prior releases: use of -host <foo> specifies a single slot. Requests to run more than one process will require either specifying slots using the "-host foo:N" syntax, or adding --oversubscribe to the cmd line.
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-24 10:11:56 -08:00
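The `-host` rules from the two commits above (a bare host name means one slot, `foo:N` means N slots, and `foo:*` or `foo:auto` means "detect the #cpus") can be summarized in a minimal C sketch. The function `parse_host_spec`, the `SLOTS_AUTO` marker, and the upper bound on N are assumptions made for illustration; this is not the parser used in ORTE, and it does not handle names that themselves contain a colon.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SLOTS_AUTO (-1)   /* hypothetical marker: detect #cpus at runtime */

/*
 * Hypothetical parser, not the ORTE code: split "name[:N]" into a host
 * name and a slot count. Returns 0 on success, -1 on a malformed spec.
 *   "foo"                 -> 1 slot
 *   "foo:4"               -> 4 slots
 *   "foo:*" or "foo:auto" -> SLOTS_AUTO
 */
int parse_host_spec(const char *spec, char *name, size_t name_len, int *slots)
{
    assert(spec != NULL && name != NULL && slots != NULL);

    const char *colon = strchr(spec, ':');
    size_t len = (colon != NULL) ? (size_t)(colon - spec) : strlen(spec);
    if (len == 0 || len >= name_len) {
        return -1;                      /* empty name or buffer too small */
    }
    memcpy(name, spec, len);
    name[len] = '\0';

    if (colon == NULL) {
        *slots = 1;                     /* bare host name: a single slot */
        return 0;
    }

    const char *arg = colon + 1;
    if (strcmp(arg, "*") == 0 || strcmp(arg, "auto") == 0) {
        *slots = SLOTS_AUTO;
        return 0;
    }

    char *end = NULL;
    long n = strtol(arg, &end, 10);
    if (end == arg || *end != '\0' || n <= 0 || n > 1000000) {
        return -1;                      /* not a positive integer */
    }
    *slots = (int)n;
    return 0;
}
```

Under these rules, running more than one process on a host given as plain `foo` would need either the `foo:N` form or `--oversubscribe`, as the commit message states.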
Ralph Castain
4e9364b9a4
Merge pull request #2794 from rhc54/topic/regs
...
Next step in reducing launch time
2017-01-24 03:19:57 -08:00
Ralph Castain
86ab751c5e
Next step in reducing launch time: begin reducing the size of the launch message itself. Start by expressing the daemon map as a set of three regular expression strings. On an 8k cluster, this reduces the nidmap contribution from over 200kBytes to 21 bytes in size.
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-23 19:54:47 -08:00
Gilles Gouaillardet
d54e54538a
orted/orted_submit: plug a memory leak
...
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:13:30 +09:00
Gilles Gouaillardet
f4dc7e4134
orted/orted_submit: plug misc memory leaks
...
- always invoke init_globals() before opal_cmd_line_parse(orte_cmd_line, ...)
- plug more leaks in init_globals()
- remove unused env_val and personalities fields from orte_cmd_options_t
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:13:29 +09:00
Gilles Gouaillardet
9d6e0482a6
orte/data_server: plug a memory leak in orte_data_server()
...
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:12:47 +09:00
Gilles Gouaillardet
0bdc594b2e
rml/base: plug a memory leak in orte_rml_API_recv_cancel()
...
simply return when the orte event thread has gone
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:12:47 +09:00
Ralph Castain
a61f7bdb26
Merge pull request #2780 from rhc54/topic/conn
...
Ensure we properly set the "shutting down" flag so connection drops by downstream peers are properly handled.
2017-01-23 06:40:28 -08:00
Ralph Castain
e7b12913b4
Ensure we properly set the "shutting down" flag so connection drops by downstream peers are properly handled.
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-23 04:00:24 -08:00
Nathan Hjelm
954a4b7be3
oob/base: fix num_threads registration type
...
This commit fixes a bug in the registration of the num_threads MCA
variable. The variable is of type int and was being registered as
a boolean.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-01-22 14:02:34 -07:00
Ralph Castain
ac4fcd3f97
Ensure that oob/base level data is always accessed in the oob/base event thread. Make debruijn the default routed component
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-22 10:33:32 -08:00
Ralph Castain
6560617c04
Fix comm_spawn and orte-dvm by resetting all used "node mapped" flags after building the child list
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-22 05:55:53 -08:00
Ralph Castain
639cdd4f9d
Add missing flag set to ensure nodes do not get double-added to job map.
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-21 20:06:50 -08:00
Ralph Castain
be3ef77739
Improve packing efficiency by raising the initial buffer size and modifying the extension code. Flag if a job map has had its nodes added so we don't have to loop repeatedly to check it.
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-21 14:03:19 -08:00
Ralph Castain
466cbd4d29
Rework the threading in oob/tcp so that daemons (including mpirun) use multiple progress threads to get messages out to their children, and so that the oob/base uses a separate one to setup sends. This allows the daemon cmd processor to execute in parallel with relay of messages, which significantly reduces launch times at scale
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-21 13:26:19 -08:00
Ralph Castain
cfce565ce9
Merge pull request #2763 from naughtont3/tjn-ortedvm-daemonize
...
dvm: add daemonize and set-sid options
2017-01-20 08:08:21 -08:00
Thomas Naughton
39d335a277
dvm: add daemonize and set-sid options
...
Signed-off-by: Thomas Naughton <naughtont@ornl.gov>
2017-01-20 09:28:26 -05:00
Ralph Castain
668421b6ec
Compress the xcast message if bigger than a defined size to further improve launch performance at scale
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-19 22:08:02 -08:00
Ralph Castain
1f46e48b94
Have mpirun and orteds activate the oob/tcp progress thread by default, leaving a way to turn it off via MCA param. Provide a method by which the add_procs command can be processed in parallel with relaying the cmd message to the next daemons down the tree.
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-19 18:52:58 -08:00
Ralph Castain
bb132f6d03
Merge pull request #2764 from rhc54/topic/dvm
...
If a tool sees the HNP it is attached to die (thereby losing connecti…
2017-01-19 15:39:30 -08:00
Ralph Castain
ca50b31de1
Merge pull request #2762 from rhc54/topic/oobfast
...
Speed-up the OOB/TCP communications by using writev instead of writing the header, and then separately write the body
2017-01-19 15:39:06 -08:00
Ralph Castain
63caeba84d
Merge pull request #2747 from rhc54/topic/topo
...
Try a different approach for scalably dealing with hetero clusters
2017-01-19 14:22:36 -08:00
Ralph Castain
19bb64cfb8
If a tool sees the HNP it is attached to die (thereby losing connection), then stop the event loop instead of going through the abort code path. This will allow the tool to cleanup before exiting
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-19 14:04:06 -08:00
Ralph Castain
e5f687f896
Speed-up the OOB/TCP communications by using writev instead of writing the header, and then separately write the body
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-19 13:03:44 -08:00
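The idea behind this change can be sketched with the POSIX `writev()` call: the header and body are handed to the kernel as one gather list, so a message costs one system call instead of two. The `msg_hdr_t` layout and both function names below are hypothetical; the real OOB/TCP header and send path differ.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical message header; the real OOB/TCP header has other fields. */
typedef struct {
    uint32_t tag;
    uint32_t nbytes;   /* length of the body that follows */
} msg_hdr_t;

/* Before: two system calls, one for the header and one for the body. */
ssize_t send_msg_two_writes(int fd, const msg_hdr_t *hdr,
                            const void *body, size_t len)
{
    ssize_t a = write(fd, hdr, sizeof(*hdr));
    if (a < 0) {
        return -1;
    }
    ssize_t b = write(fd, body, len);
    if (b < 0) {
        return -1;
    }
    return a + b;
}

/*
 * After: a single writev() call covering header and body.
 * Returns the number of bytes written, or -1 on error. Short writes are
 * possible on sockets and pipes; a real sender would loop and resume.
 */
ssize_t send_msg(int fd, const msg_hdr_t *hdr, const void *body, size_t len)
{
    assert(hdr != NULL && (body != NULL || len == 0));

    struct iovec iov[2];
    iov[0].iov_base = (void *)hdr;    /* writev() does not modify the data */
    iov[0].iov_len  = sizeof(*hdr);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = len;

    return writev(fd, iov, 2);
}
```

Besides saving a system call, a single `writev()` avoids the header going out in its own small TCP segment ahead of the body.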
Mark Santcroos
656bdcfc54
Expose opal_set_using_threads and improve error message on missing ompi_info.
...
Signed-off-by: Mark Santcroos <mark.santcroos@rutgers.edu>
2017-01-19 07:57:58 -05:00
Ralph Castain
368684bd63
Revert e9bc293 and try a different approach for scalably dealing with hetero clusters. Have each orted send back its topo "signature". If mpirun detects that this signature has not been seen before, then ask for that daemon to send back its full topology description. This allows the system to only get the topology once for each unique topo in the cluster.
...
Cleanup a typo, and remove no longer needed MCA params for hetero nodes and hetero apps. Hetero nodes will always be automatically detected. We don't support a mix of 32- and 64-bit apps.
Modify the orte_node_t to use orte_topology_t instead of hwloc_topology_t, updating all the places that use it. Ensure that we properly update topology when we see a different one on a compute node.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-18 10:22:15 -08:00
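The signature scheme described in this commit amounts to a "have I seen this before?" lookup on the mpirun side. A minimal C sketch follows, assuming signatures are short strings. The table type, its size limits, the function name, and the example signature strings are all hypothetical and do not reflect the actual `orte_topology_t` handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_TOPOS 64    /* hypothetical cap on distinct topologies */
#define SIG_LEN   128   /* hypothetical cap on signature length */

/* Hypothetical table of topology signatures mpirun has already seen. */
typedef struct {
    char sigs[MAX_TOPOS][SIG_LEN];
    int  count;
} topo_table_t;

/*
 * Record a daemon's signature. Returns true if it has not been seen
 * before, meaning the caller should ask that daemon for its full
 * topology; returns false if the topology is already known.
 * If the table is full, the signature is reported as new but not stored.
 */
bool topo_signature_is_new(topo_table_t *t, const char *sig)
{
    assert(t != NULL && sig != NULL && strlen(sig) < SIG_LEN);

    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->sigs[i], sig) == 0) {
            return false;
        }
    }
    if (t->count < MAX_TOPOS) {
        strcpy(t->sigs[t->count], sig);
        t->count++;
    }
    return true;
}
```

On a homogeneous cluster only the first daemon's signature is new, so only one full topology description crosses the network regardless of the number of nodes.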
Ralph Castain
c8768e3dab
Merge pull request #2740 from rhc54/topic/hnp
...
Add an MCA param "hnp_on_smgmt_node"
2017-01-17 05:49:51 -08:00