1
1

11 Коммитов

Автор SHA1 Сообщение Дата
Brian Barrett
8b778903d8 Fix longstanding issue with our multi-project support. Rather than using
pkg{data,lib,includedir}, use our own ompi{data,lib,includedir}, which is
always set to {datadir,libdir,includedir}/openmpi.  This will keep us from
having help files in prefix/share/open-rte when building without Open MPI,
but in prefix/share/openmpi when building with Open MPI.

This commit was SVN r30140.
2014-01-07 22:11:15 +00:00
Ralph Castain
0e81959aae Cleanup mindist error messages - already patched in 1.7
This commit was SVN r29869.
2013-12-12 15:30:29 +00:00
Mike Dubman
c208b858e7 improve error messages in mindist
cmr=v1.7.4:reviewer=ompi-rm1.7

This commit was SVN r29846.
2013-12-09 06:34:38 +00:00
Mike Dubman
840e2cb4a2 mindist: cosmetic, use fallback to byslot if unable to read NUMA info, small fix.
fixed by Elena, reviewed by Ralph/Mike
cmr=v1.7.4:reviewer=ompi-gk1.7

This commit was SVN r29679.
2013-11-13 09:26:40 +00:00
Joshua Ladd
d594ffbfc7 Backing out Elena's patch - abstraction violation
This commit was SVN r29645.
2013-11-08 13:12:07 +00:00
Joshua Ladd
da3e272fdd Adds a check in the mindist mapper for whether or not the user asks for a specific device. This patch was submited by Elena Elkina and reviewed by Josh Ladd and should be added to
cmr=v1.7.4:reviewer=jladd

This commit was SVN r29644.
2013-11-08 04:28:53 +00:00
Ralph Castain
7a7cfdd519 A little cleanup - the base function to sort numa lists must return something or you get a warning about non-void function returning without value, so cleanup the return values. Ensure the mindist module actually checks for a return of "error" so it won't segfault, and have it emit a polite message when that happens.
cmr:v1.7.3:reviewer=jladd

This commit was SVN r29089.
2013-08-29 20:01:06 +00:00
Joshua Ladd
1802aabf1a Add support for autodetecting a MLNX HCA in the rmaps min distance feature. In this way, .ini files distributed with software stacks need not specify a particular HCA but instead may select the key word auto which will automatically select the discovered device. To use this feature, simply pass the keyword auto instead of a specific device name, --mca rmaps_base_dist_hca auto. If more than one card is installed, the mapper will inform the user of this and, at this point, the user will then need to specify which card via the normal route, e.g. --mca rmaps_base_dist_hca <dev_name>. This should be added to \ncmr=v1.7.4:reviewer=rhc:subject=Autodetect logic for min dist mapping
This commit was SVN r29079.
2013-08-28 16:23:33 +00:00
Ralph Castain
b2d86e1857 Silence uninitialized var warning
This commit was SVN r29034.
2013-08-16 21:35:51 +00:00
Joshua Ladd
61ffb47573 Minor fix for the min-dist mapping algorithm: we need to call 'get_nbobjs_by_type' first, before we get the sorted list of nodes - we need to add node objects and fill them in the summary object for the current topology. This patch was submitted by Elena Elkina and pushed by Josh Ladd. This should be added to cmr:v1.7:reviewer=jladd
This commit was SVN r28578.
2013-05-31 15:19:59 +00:00
Jeff Squyres
6d173af329 This commit introduces a new "mindist" ORTE RMAPS mapper, as well as
some relevant updates/new functionality in the opal/mca/hwloc and
orte/mca/rmaps bases.  This work was mainly developed by Mellanox,
with a bunch of advice from Ralph Castain, and some minor advice from
Brice Goglin and Jeff Squyres.

Even though this is mainly Mellanox's work, Jeff is committing only
for logistical reasons (he holds the hg+svn combo tree, and can
therefore commit it directly back to SVN).

-----

Implemented distance-based mapping algorithm as a new "mindist"
component in the rmaps framework.  It allows mapping processes by NUMA
due to PCI locality information as reported by the BIOS - from the
closest to device to furthest.

To use this algorithm, specify:

   {{{mpirun --map-by dist:<device_name>}}}

where <device_name> can be mlx5_0, ib0, etc.

There are two modes provided:

 1. bynode: load-balancing across nodes
 1. byslot: go through slots sequentially (i.e., the first nodes are
     more loaded)

These options are regulated by the optional ''span'' modifier; the
command line parameter looks like:

    {{{mpirun --map-by dist:<device_name>,span}}}

So, for example, if there are 2 nodes, each with 8 cores, and we'd
like to run 10 processes, the mindist algorithm will place 8 processes
to the first node and 2 to the second by default. But if you want to
place 5 processes to each node, you can add a span modifier in your
command line to do that.

If there are two NUMA nodes on the node, each with 4 cores, and we run
6 processes, the mindist algorithm will try to find the NUMA closest
to the specified device, and if successful, it will place 4 processes
on that NUMA but leaving the remaining two to the next NUMA node.

You can also specify the number of cpus per MPI process. This option
is handled so that we map as many processes to the closest NUMA as we
can (number of available processors at the NUMA divided by number of
cpus per rank) and then go on with the next closest NUMA.

The default binding option for this mapping is bind-to-numa. It works
if you don't specify any binding policy. But if you specified binding
level that was "lower" than NUMA (i.e hwthread, core, socket) it would
bind to whatever level you specify.

This commit was SVN r28552.
2013-05-22 13:04:40 +00:00