openmpi/orte/util
Ralph Castain 07655e2945 Handle the case where the allocator "fibs" to us about the node names. In some cases (ahem...you know who you are!), the allocator will tell us a node number (e.g., "16"). However, the daemon will return a node name (e.g., "nid0016") - leaving us not recognizing its location.
So provide a new parameter (can't have too many!) that handles this situation by stripping the prefix from the returned node name. Also do a little cleanup to ensure we cleanly exit from errors, without generating too many annoying messages.

This commit was SVN r25562.
2011-12-02 14:10:08 +00:00
..
comm This is actually a much smaller commit than it appears at first glance - it just touches a lot of files. The --without-rte-support configuration option has never really been implemented completely. The option caused various objects not to be defined and conditionally compiled some base functions, but did nothing to prevent build of the component libraries. Unfortunately, since many of those components use objects covered by the option, it caused builds to break if those components were allowed to build. 2011-11-22 21:24:35 +00:00
dash_host This is actually a much smaller commit than it appears at first glance - it just touches a lot of files. The --without-rte-support configuration option has never really been implemented completely. The option caused various objects not to be defined and conditionally compiled some base functions, but did nothing to prevent build of the component libraries. Unfortunately, since many of those components use objects covered by the option, it caused builds to break if those components were allowed to build. 2011-11-22 21:24:35 +00:00
hostfile At long last, the fabled revision to the affinity system has arrived. A more detailed explanation of how this all works will be presented here: 2011-11-15 03:40:11 +00:00
context_fns.c Hostname is not used in this function. 2010-07-21 11:07:28 +00:00
context_fns.h ... Delayed due to notifier commits earlier this day ... 2009-04-29 01:32:14 +00:00
error_strings.c Handle the case where the allocator "fibs" to us about the node names. In some cases (ahem...you know who you are!), the allocator will tell us a node number (e.g., "16"). However, the daemon will return a node name (e.g., "nid0016") - leaving us not recognizing its location. 2011-12-02 14:10:08 +00:00
error_strings.h Clean up an error in r24371 - can't use a const parameter as target in asprintf as it changes the value of the address. 2011-02-14 19:29:09 +00:00
help-regex.txt Fix regular expression analyzer for slurmd - use a slurm-specific version 2011-07-13 22:49:56 +00:00
hnp_contact.c This is actually a much smaller commit than it appears at first glance - it just touches a lot of files. The --without-rte-support configuration option has never really been implemented completely. The option caused various objects not to be defined and conditionally compiled some base functions, but did nothing to prevent build of the component libraries. Unfortunately, since many of those components use objects covered by the option, it caused builds to break if those components were allowed to build. 2011-11-22 21:24:35 +00:00
hnp_contact.h As requested by Aurelien at the July design meeting - long time coming, but finally got around to it. 2008-12-10 17:10:39 +00:00
Makefile.am Revamp the errmgr framework to provide a greater range of optional behaviors, including different behaviors for daemons, and remove several looping messages across the code base: 2010-04-23 04:44:41 +00:00
name_fns.c This is actually a much smaller commit than it appears at first glance - it just touches a lot of files. The --without-rte-support configuration option has never really been implemented completely. The option caused various objects not to be defined and conditionally compiled some base functions, but did nothing to prevent build of the component libraries. Unfortunately, since many of those components use objects covered by the option, it caused builds to break if those components were allowed to build. 2011-11-22 21:24:35 +00:00
name_fns.h Each level (OPAL/ORTE/OMPI) should only return its own constants, 2011-10-04 14:50:31 +00:00
nidmap.c Silence a few icc warnings about mixing enums with other types. 2011-12-02 13:18:54 +00:00
nidmap.h Per the RFC from Jeff, move hwloc from opal/mca/common to its own static framework ala libevent. Have ORTE daemons collect the topology info at startup and, if --enable-hwloc-xml is set, send that info back to the HNP for later use. The HNP only retains unique topology "templates" to reduce memory footprint. Have the daemon include the local topology info in the nidmap buffer sent to each app so the apps don't all hammer the local system to discover it for themselves. 2011-09-11 19:02:24 +00:00
parse_options.c Use ports as multicast channels instead of networks so we avoid stepping into reserved spaces. 2011-04-29 18:46:40 +00:00
parse_options.h Use ports as multicast channels instead of networks so we avoid stepping into reserved spaces. 2011-04-29 18:46:40 +00:00
pre_condition_transports.c This is actually a much smaller commit than it appears at first glance - it just touches a lot of files. The --without-rte-support configuration option has never really been implemented completely. The option caused various objects not to be defined and conditionally compiled some base functions, but did nothing to prevent build of the component libraries. Unfortunately, since many of those components use objects covered by the option, it caused builds to break if those components were allowed to build. 2011-11-22 21:24:35 +00:00
pre_condition_transports.h In the case of direct-launched processes running under slurm, psm requires that the pre_condition_transports MCA param be set. This is normally computed by mpirun and inserted into each proc's environ, but that doesn't work here. 2011-04-28 13:54:33 +00:00
proc_info.c Gah! r25545 accidentally included ''waaaay'' more stuff than it was 2011-11-29 23:24:52 +00:00
proc_info.h At long last, the fabled revision to the affinity system has arrived. A more detailed explanation of how this all works will be presented here: 2011-11-15 03:40:11 +00:00
regex.c This is actually a much smaller commit than it appears at first glance - it just touches a lot of files. The --without-rte-support configuration option has never really been implemented completely. The option caused various objects not to be defined and conditionally compiled some base functions, but did nothing to prevent build of the component libraries. Unfortunately, since many of those components use objects covered by the option, it caused builds to break if those components were allowed to build. 2011-11-22 21:24:35 +00:00
regex.h Fix some major bit-rot on scalable launch. If static ports are provided, then daemons can connect back to the HNP via the routed connection tree instead of doing so directly. In order to do that at scale, the node list must be passed as a regular expression - otherwise, the orted command line gets too long. 2011-07-07 18:54:30 +00:00
session_dir.c This is actually a much smaller commit than it appears at first glance - it just touches a lot of files. The --without-rte-support configuration option has never really been implemented completely. The option caused various objects not to be defined and conditionally compiled some base functions, but did nothing to prevent build of the component libraries. Unfortunately, since many of those components use objects covered by the option, it caused builds to break if those components were allowed to build. 2011-11-22 21:24:35 +00:00
session_dir.h - Revert r20739 2009-03-05 21:56:03 +00:00
show_help.c This is actually a much smaller commit than it appears at first glance - it just touches a lot of files. The --without-rte-support configuration option has never really been implemented completely. The option caused various objects not to be defined and conditionally compiled some base functions, but did nothing to prevent build of the component libraries. Unfortunately, since many of those components use objects covered by the option, it caused builds to break if those components were allowed to build. 2011-11-22 21:24:35 +00:00
show_help.h Fix a few issues with error messages: 2011-03-07 16:45:45 +00:00