openmpi/orte/util
Ralph Castain b44f8d4b28 Complete implementation of the ess.proc_get_locality API. Up to this point, the API was only capable of telling if the specified proc was sharing a node with you. However, the returned value was capable of telling you much more detailed info - e.g., if the proc shares a socket, a cache, or a NUMA node. We just didn't have the data to provide that detail.
Use hwloc to obtain the cpuset for each process during mpi_init, and share that info in the modex. As it arrives, use a new opal_hwloc_base utility function to parse the value against the local proc's cpuset and determine where they overlap. Cache the value in the pmap object as it may be referenced multiple times.
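
The overlap check described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: it assumes a cpuset fits in a plain 64-bit mask (one bit per logical CPU) instead of an hwloc cpuset, and demo_cpuset_t and demo_share_object are hypothetical names, not part of opal_hwloc_base.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpuset: one bit per logical CPU (max 64).
 * The real code uses hwloc cpusets; this is an assumption for brevity. */
typedef uint64_t demo_cpuset_t;

/* Return true if the local and peer cpusets each have at least one CPU
 * inside the same object (e.g. the same socket). Each entry of objects[]
 * is the CPU mask covered by one object at the level being tested. */
static bool demo_share_object(demo_cpuset_t local, demo_cpuset_t peer,
                              const demo_cpuset_t *objects, size_t nobjects)
{
    for (size_t i = 0; i < nobjects; i++) {
        if ((local & objects[i]) != 0 && (peer & objects[i]) != 0) {
            return true;
        }
    }
    return false;
}
```

Running the same check once per level (socket, cache, NUMA node) with that level's object masks would yield one yes/no answer per level, which is the kind of result the commit caches in the pmap object.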

Thus, the return value from orte_ess.proc_get_locality is a 16-bit bitmask that describes the resources being shared with you. This bitmask can be tested using the macros in opal/mca/paffinity/paffinity.h.
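
As a rough sketch of how such a bitmask might be tested: the DEMO_* flag names, their bit values, and the helper below are all hypothetical and chosen only for illustration. The real macros are the ones defined in opal/mca/paffinity/paffinity.h and may differ in name, value, and granularity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 16-bit locality type and flags (illustrative values only). */
typedef uint16_t demo_locality_t;

#define DEMO_ON_NODE   ((demo_locality_t)0x0001)
#define DEMO_ON_NUMA   ((demo_locality_t)0x0002)
#define DEMO_ON_SOCKET ((demo_locality_t)0x0004)
#define DEMO_ON_CACHE  ((demo_locality_t)0x0008)

/* Test whether a locality value has a given resource bit set. */
#define DEMO_SHARES(loc, flag) ((((loc) & (flag)) != 0))

/* Example policy: a peer is a shared-memory candidate if it is on our node. */
static bool demo_same_node(demo_locality_t loc)
{
    return DEMO_SHARES(loc, DEMO_ON_NODE);
}
```

A caller would obtain the locality value for a peer once and then test individual bits, for example preferring a peer that shares a cache over one that only shares the node.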

Locality is available for all procs, whether launched via mpirun or directly with an external launcher such as slurm or aprun.

This commit was SVN r25331.
2011-10-19 20:18:14 +00:00
..
comm By popular demand the epoch code is now disabled by default. 2011-08-26 22:16:14 +00:00
dash_host * Wrap all the direct error-code checks of the form (OMPI_ERR_* == ret) with 2010-05-17 23:08:56 +00:00
hostfile patched the lex files to not issue the following compiler warning: 2011-10-10 18:13:04 +00:00
context_fns.c Hostname is not used in this function. 2010-07-21 11:07:28 +00:00
context_fns.h ... Delayed due to notifier commits earlier this day ... 2009-04-29 01:32:14 +00:00
error_strings.c Add some missing error strings. Update test to show silent errors 2011-08-08 04:21:02 +00:00
error_strings.h Clean up an error in r24371 - can't use a const parameter as target in asprintf as it changes the value of the address. 2011-02-14 19:29:09 +00:00
help-regex.txt Fix regular expression analyzer for slurmd - use a slurm-specific version 2011-07-13 22:49:56 +00:00
hnp_contact.c By popular demand the epoch code is now disabled by default. 2011-08-26 22:16:14 +00:00
hnp_contact.h As requested by Aurelien at the July design meeting - long time coming, but finally got around to it. 2008-12-10 17:10:39 +00:00
Makefile.am Revamp the errmgr framework to provide a greater range of optional behaviors, including different behaviors for daemons, and remove several looping messages across the code base: 2010-04-23 04:44:41 +00:00
name_fns.c By popular demand the epoch code is now disabled by default. 2011-08-26 22:16:14 +00:00
name_fns.h Each level (OPAL/ORTE/OMPI) should only return its own constants, 2011-10-04 14:50:31 +00:00
nidmap.c Complete implementation of the ess.proc_get_locality API. Up to this point, the API was only capable of telling if the specified proc was sharing a node with you. However, the returned value was capable of telling you much more detailed info - e.g., if the proc shares a socket, a cache, or a NUMA node. We just didn't have the data to provide that detail. 2011-10-19 20:18:14 +00:00
nidmap.h Per the RFC from Jeff, move hwloc from opal/mca/common to its own static framework, à la libevent. Have ORTE daemons collect the topology info at startup and, if --enable-hwloc-xml is set, send that info back to the HNP for later use. The HNP only retains unique topology "templates" to reduce memory footprint. Have the daemon include the local topology info in the nidmap buffer sent to each app so the apps don't all hammer the local system to discover it for themselves. 2011-09-11 19:02:24 +00:00
parse_options.c Use ports as multicast channels instead of networks so we avoid stepping into reserved spaces. 2011-04-29 18:46:40 +00:00
parse_options.h Use ports as multicast channels instead of networks so we avoid stepping into reserved spaces. 2011-04-29 18:46:40 +00:00
pre_condition_transports.c Missed one file from the last commit. 2011-04-29 14:44:02 +00:00
pre_condition_transports.h In the case of direct-launched processes running under slurm, psm requires that the pre_condition_transports MCA param be set. This is normally computed by mpirun and inserted into each proc's environ, but that doesn't work here. 2011-04-28 13:54:33 +00:00
proc_info.c Complete implementation of the ess.proc_get_locality API. Up to this point, the API was only capable of telling if the specified proc was sharing a node with you. However, the returned value was capable of telling you much more detailed info - e.g., if the proc shares a socket, a cache, or a NUMA node. We just didn't have the data to provide that detail. 2011-10-19 20:18:14 +00:00
proc_info.h Complete implementation of the ess.proc_get_locality API. Up to this point, the API was only capable of telling if the specified proc was sharing a node with you. However, the returned value was capable of telling you much more detailed info - e.g., if the proc shares a socket, a cache, or a NUMA node. We just didn't have the data to provide that detail. 2011-10-19 20:18:14 +00:00
regex.c Cleanup handling of all-numerical node names 2011-08-05 14:59:14 +00:00
regex.h Fix some major bit-rot on scalable launch. If static ports are provided, then daemons can connect back to the HNP via the routed connection tree instead of doing so directly. In order to do that at scale, the node list must be passed as a regular expression - otherwise, the orted command line gets too long. 2011-07-07 18:54:30 +00:00
session_dir.c Fix a few valgrind-reported memory leaks 2011-03-08 17:37:28 +00:00
session_dir.h - Revert r20739 2009-03-05 21:56:03 +00:00
show_help.c Fix a few issues with error messages: 2011-03-07 16:45:45 +00:00
show_help.h Fix a few issues with error messages: 2011-03-07 16:45:45 +00:00