Brice Goglin ea80a20e10 hwloc/base: fix opal proc locality wrt to NUMA nodes on hwloc 2.0
Both opal_hwloc_base_get_relative_locality() and _get_locality_string()
iterate over hwloc levels to build the proc locality information.
Unfortunately, NUMA nodes are not in those normal levels anymore since 2.0.
We have to explicitly look at the special NUMA level to get that locality info.

I am factorizing the core of the iterations inside dedicated "_by_depth"
functions and calling them again for the NUMA level at the end of the loops.
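
A minimal sketch of that refactoring, to make the shape of the change concrete.
This is a toy model and not the hwloc or OPAL API: every toy_* name, the
64-bit cpuset masks and the sample topology are assumptions made for
illustration only. The per-level work lives in one helper, which is run over
the normal levels and then once more for the NUMA level that sits outside them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins (hypothetical, not hwloc types): a cpuset is a 64-bit mask. */
typedef uint64_t toy_cpuset_t;

enum { TOY_LOC_PACKAGE = 1 << 0, TOY_LOC_CORE = 1 << 1, TOY_LOC_NUMA = 1 << 2 };

typedef struct {
    unsigned locality_flag;      /* flag reported when both procs share an object here */
    size_t nobjs;                /* number of objects at this level */
    const toy_cpuset_t *objs;    /* cpuset of each object at this level */
} toy_level_t;

typedef struct {
    size_t nlevels;
    const toy_level_t *levels;   /* normal levels, walked by the main loop */
    toy_level_t numa;            /* special level, kept outside the normal ones */
} toy_topology_t;

/* Shared core: return the level's flag if one object holds CPUs of both procs. */
static unsigned toy_locality_by_level(const toy_level_t *level,
                                      toy_cpuset_t a, toy_cpuset_t b)
{
    for (size_t i = 0; i < level->nobjs; i++) {
        if ((level->objs[i] & a) && (level->objs[i] & b)) {
            return level->locality_flag;
        }
    }
    return 0;
}

static unsigned toy_relative_locality(const toy_topology_t *topo,
                                      toy_cpuset_t a, toy_cpuset_t b)
{
    unsigned locality = 0;
    assert(topo != NULL);
    /* Walk the normal levels as before. */
    for (size_t d = 0; d < topo->nlevels; d++) {
        locality |= toy_locality_by_level(&topo->levels[d], a, b);
    }
    /* The NUMA level is not among them, so run the same helper on it explicitly. */
    locality |= toy_locality_by_level(&topo->numa, a, b);
    return locality;
}

/* Sample machine: 1 package of 8 CPUs, 4 two-CPU cores, 2 four-CPU NUMA nodes. */
static const toy_cpuset_t toy_packages[] = { 0xFF };
static const toy_cpuset_t toy_cores[] = { 0x03, 0x0C, 0x30, 0xC0 };
static const toy_cpuset_t toy_numas[] = { 0x0F, 0xF0 };
static const toy_level_t toy_levels[] = {
    { TOY_LOC_PACKAGE, 1, toy_packages },
    { TOY_LOC_CORE, 4, toy_cores },
};
static const toy_topology_t toy_sample = {
    2, toy_levels, { TOY_LOC_NUMA, 2, toy_numas }
};
```

In this model, dropping the final call in toy_relative_locality() loses the
NUMA flag for two procs bound to CPUs 0 and 2 of toy_sample, which mirrors the
missing NUMA locality described above.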

Thanks to Hatem Elshazly for reporting the NUMA communicator split failure
at https://www.mail-archive.com/users@lists.open-mpi.org/msg33589.html

It looks like only the opal_hwloc_base_get_locality_string() part is needed
to fix that split, but there's no reason not to fix get_relative_locality()
as well.

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2019-11-27 12:41:33 +01:00

12 Sep 2011

Notes for hwloc component maintainers:

1. There can only be *1* hwloc version component at a time.
   Specifically: if there are multiple hwlocXYZ components (i.e.,
   different versions of hwloc), then they must all be .ompi_ignore'd
   except for 1.  This is because we currently m4_include all of the
   underlying hwloc's .m4 files -- if there are multiple hwlocXYZ
   components, I don't know if m4 will barf at the multiple,
   conflicting AC_DEFUNs, or whether it'll just do something
   completely undefined.

1a. As a consequence, if you're adding a new hwloc version component,
   you'll need to .ompi_ignore all others while you're testing the new
   one.

2. If someone wants to fix #1 someday, we might be able to do what we
   do for libevent: OPAL_CONFIG_SUBDIR (instead of slurping in hwloc's
   .m4 files).