1
1
Граф коммитов

18377 Коммитов

Автор SHA1 Сообщение Дата
Jeff Squyres
f85dca0285 Fix spelling mistake that has been there for a long, long time...
This commit was SVN r28562.
2013-05-24 22:28:15 +00:00
Jeff Squyres
e339aa7561 Add missing svn:ignore on new mindist component
This commit was SVN r28561.
2013-05-24 15:58:47 +00:00
Rolf vandeVaart
3d1d158a80 Do not abort in BTL. Rather, callback into PML error function. Thanks George for review.
This commit was SVN r28559.
2013-05-23 18:45:23 +00:00
Brian Barrett
d4c33af0cd Update news to match the v1.7 branch
This commit was SVN r28558.
2013-05-23 17:56:12 +00:00
Rolf vandeVaart
7771857991 Adjust how cuda.h is found. It can be found in the with-cuda dir now.
This commit was SVN r28555.
2013-05-22 22:04:46 +00:00
George Bosilca
a9aae9c538 Patch based on Takahiro Kawashima fixing the issues with some
of the Fortran datatypes. This patch prevent the copy of the
datatype description from the OPAL to the OMPI layer in order
to decrease the memory requirements.

This commit was SVN r28553.
2013-05-22 18:35:21 +00:00
Jeff Squyres
6d173af329 This commit introduces a new "mindist" ORTE RMAPS mapper, as well as
some relevant updates/new functionality in the opal/mca/hwloc and
orte/mca/rmaps bases.  This work was mainly developed by Mellanox,
with a bunch of advice from Ralph Castain, and some minor advice from
Brice Goglin and Jeff Squyres.

Even though this is mainly Mellanox's work, Jeff is committing only
for logistical reasons (he holds the hg+svn combo tree, and can
therefore commit it directly back to SVN).

-----

Implemented distance-based mapping algorithm as a new "mindist"
component in the rmaps framework.  It allows mapping processes by NUMA
due to PCI locality information as reported by the BIOS - from the
closest to device to furthest.

To use this algorithm, specify:

   {{{mpirun --map-by dist:<device_name>}}}

where <device_name> can be mlx5_0, ib0, etc.

There are two modes provided:

 1. bynode: load-balancing across nodes
 1. byslot: go through slots sequentially (i.e., the first nodes are
     more loaded)

These options are regulated by the optional ''span'' modifier; the
command line parameter looks like:

    {{{mpirun --map-by dist:<device_name>,span}}}

So, for example, if there are 2 nodes, each with 8 cores, and we'd
like to run 10 processes, the mindist algorithm will place 8 processes
to the first node and 2 to the second by default. But if you want to
place 5 processes to each node, you can add a span modifier in your
command line to do that.

If there are two NUMA nodes on the node, each with 4 cores, and we run
6 processes, the mindist algorithm will try to find the NUMA closest
to the specified device, and if successful, it will place 4 processes
on that NUMA but leaving the remaining two to the next NUMA node.

You can also specify the number of cpus per MPI process. This option
is handled so that we map as many processes to the closest NUMA as we
can (number of available processors at the NUMA divided by number of
cpus per rank) and then go on with the next closest NUMA.

The default binding option for this mapping is bind-to-numa. It works
if you don't specify any binding policy. But if you specified binding
level that was "lower" than NUMA (i.e hwthread, core, socket) it would
bind to whatever level you specify.

This commit was SVN r28552.
2013-05-22 13:04:40 +00:00
Jeff Squyres
55382c1bf8 Bring over upstream hwloc trunk commit
https://svn.open-mpi.org/trac/hwloc/changeset/5592 to fix the merging
of groups when they are I/O objects.

This commit was SVN r28551.
2013-05-22 12:34:59 +00:00
Jeff Squyres
43a534a5c6 Revert r28544: the original code was fine.
This commit was SVN r28549.

The following SVN revision numbers were found above:
  r28544 --> open-mpi/ompi@c830d96673
2013-05-21 16:06:08 +00:00
Jeff Squyres
c830d96673 Silence compiler warnings, as suggested by Alan Sayre.
This commit was SVN r28544.
2013-05-21 13:42:18 +00:00
Jeff Squyres
52f4a10c21 Gang together with r28542: I missed a secondary location where .git
needed to be mentioned.

This commit was SVN r28543.

The following SVN revision numbers were found above:
  r28542 --> open-mpi/ompi@772284b89a
2013-05-20 17:11:53 +00:00
Jeff Squyres
772284b89a Clarify that we do debug builds for .svn, .hg, and .git.
This commit was SVN r28542.
2013-05-20 17:10:30 +00:00
Nathan Hjelm
721779d7ab Per RFC: remove old MCA parameter system.
This commit was SVN r28541.
2013-05-20 15:36:13 +00:00
Ralph Castain
889bf60c64 Fix bad merge
This commit was SVN r28540.
2013-05-18 01:29:55 +00:00
Jeff Squyres
089c632cce Remove a bunch of dead code: gcc 4.7 warns of set-but-unused
variables.  So get rid of them.

This commit was SVN r28538.
2013-05-17 21:45:49 +00:00
Edgar Gabriel
1b1051da6c fix a bug in the calculation of the explicit offset. Use the opportunity to
clean up the code a bit.

This commit was SVN r28537.
2013-05-17 20:22:00 +00:00
Jeff Squyres
f43fde5394 Gah! How did we end up without a greek string? :-(
This commit was SVN r28536.
2013-05-17 13:25:43 +00:00
George Bosilca
4d9f30fb05 Fix issue identified by Takahiro Kawashima regarding the overwriting
of the OPAL datatype descriptions upon MPI_Init. Now each layer (OPAL
and OMPI) uses it's own descriptions for the predefined datatypes,
thus preventing over-writing of the other layer data description.

This commit was SVN r28535.
2013-05-17 13:09:16 +00:00
Ralph Castain
e100b8d165 don't need the return value, but should check for error
This commit was SVN r28534.
2013-05-16 15:15:02 +00:00
Ralph Castain
3e6e1046a3 fix a correctness issue by returning an error if waitall fails and invoking the mpi error handler
cmr:v1.7.2:reviewer=jsquyres

This commit was SVN r28533.
2013-05-16 15:04:37 +00:00
Jeff Squyres
128cc27417 Minor type fix (they're both enums/ints, so the compiler previously
silently cast them).

This commit was SVN r28532.
2013-05-16 00:47:37 +00:00
Ralph Castain
93ba4247f8 remove extra paren when --without-hwloc
This commit was SVN r28530.
2013-05-15 21:31:45 +00:00
Ralph Castain
3a372a65b8 Mapping policies must be tested as equalities as they are values, not bitmasks
This commit was SVN r28526.
2013-05-15 13:45:00 +00:00
Ralph Castain
29e4b0cc50 Cannot test equality on mapping directives as it is a bitmask
This commit was SVN r28525.
2013-05-15 13:41:49 +00:00
Matthias Jurenz
ef0a080028 Changes to VT:
- fixed compiler warnings when compiling for 32-bit
	- MPI wrapper generator scripts:
		- removed non-posix call to length(array)
		- exit scripts if any statement returns a non-true return value (set -e)

This commit was SVN r28524.
2013-05-15 10:44:51 +00:00
Ralph Castain
1ec13d530c Allow simple way to request comparison to full address regardless of addr family
This commit was SVN r28519.
2013-05-14 22:08:39 +00:00
Ralph Castain
eb2edb4b2b Silence warning
This commit was SVN r28516.
2013-05-14 22:00:01 +00:00
Ralph Castain
04b11accd3 Silience a few warnings
This commit was SVN r28515.
2013-05-14 21:58:40 +00:00
Rolf vandeVaart
8a8ea9ba1b Fix compile error in optimize build for CUDA-aware code.
This commit was SVN r28512.
2013-05-14 21:07:27 +00:00
Rolf vandeVaart
91fdb423d7 Fix warning in CUDA-aware code.
This commit was SVN r28511.
2013-05-14 21:04:15 +00:00
Rolf vandeVaart
52ebb0b17f Change some opal_output to OPAL_OUTPUT per CMR review.
This commit was SVN r28510.
2013-05-14 20:49:42 +00:00
Ralph Castain
5296099ecb Fix the cpus-per-rank when binding to hwthreads. Add cpus-per-rank to diag printout
Thanks to Elena for reporting the problem

This commit was SVN r28508.
2013-05-14 20:17:50 +00:00
Nathan Hjelm
32a8ff5255 btl/openib: bump up udcm priority
This commit was SVN r28505.
2013-05-14 20:02:40 +00:00
Edgar Gabriel
d5cae9aced - fix the mca stripe size and stripe depth parameter logic in the pvfs2
component
- correctly recognize and handle the corresponding info objects.

This commit was SVN r28497.
2013-05-14 16:11:39 +00:00
Ralph Castain
37088f23d8 When ipv6 disabled, we still have getaddrinfo, so use it when checking common networks for resolving to kindex
This commit was SVN r28496.
2013-05-14 15:54:46 +00:00
Matthias Jurenz
9a0432632a Changes to VT:
- general:
		- incremented version number to 5.14.4
		- fixed Coverity CIDs: 72002, 72099, 72273, 710580, 710664, 710665, 710666
	- VT libs:
		- fixed "incompatible declaration" errors when building against an MPI-3 implementation
		  Since MPI-3 the C keyword "const" is added to all relevant MPI API parameters
		  (e.g. MPI_Send(void* sendbuf, ...) -> MPI_Send(const void* sendbuf, ...)).
		  Prepending the macro CONST to these parameters which is defined either to "const" (if MPI-3) or to nothing (if MPI-1/2).
		- fixed potential buffer overflow when reading the filter file
		- CUDA tracing:
			- enabled access to CUPTI counters for CUDA tracing via CUPTI
			- enabled GPU memory usage tracing independent of the CUDA API
			- enabled recording of CUDA synchronization and implicit synchronization in blocking CUDA memory copies for CUDA tracing via CUPTI
			- enabled recording of synchronous peer-to-peer CUDA memory copies for CUDA tracing via CUPTI
			- consider CUDA data transfers as not idle for option 'pure_idle'
			- fixed identification of the CUDA device ID for CUDA tracing via CUPTI
			- fixed region filtering for applications using the CUDA runtime API wrapper
	- compiler wrappers: add path to mpi.h to the PDT parser command and preprocessor flags

This commit was SVN r28494.
2013-05-14 14:28:04 +00:00
Ralph Castain
3fc1bafd82 fix typo
This commit was SVN r28490.
2013-05-14 12:36:45 +00:00
Yossi Etigin
64d98e0438 Fix data corruption in MXM by registering to OPAL memory release hooks and removing any mappings created by mxm
This commit was SVN r28489.
2013-05-14 12:27:44 +00:00
Ralph Castain
f4f07bdb21 Ensure the opal_ifaddrtokindex function considers the full range of address space by using the netmask
This commit was SVN r28487.
2013-05-14 03:37:44 +00:00
Ralph Castain
c33219a51b Extend the bitmap API a bit to provide a test if all bits zero
This commit was SVN r28486.
2013-05-14 03:34:57 +00:00
Ralph Castain
427b6b0b47 Fix the verbosity of yet another framework...sigh.
This commit was SVN r28481.
2013-05-13 14:36:32 +00:00
Jeff Squyres
3717307fb6 Add bullet about wrapper rpath behavior.
This commit was SVN r28480.
2013-05-11 00:49:44 +00:00
Jeff Squyres
4d9da92e60 Fixes trac:376: bu default the wrappr compilers will enable rpath support
in generated executables on systems that support it.  Use
--disable-wrapper-rpath to disable this behavior.  See text in
README about --disable-wrapper-rpath for more details.

This commit was SVN r28479.

The following Trac tickets were found above:
  Ticket 376 --> https://svn.open-mpi.org/trac/ompi/ticket/376
2013-05-11 00:49:17 +00:00
Rolf vandeVaart
9d569f1487 Fix warning when compiling in CUDA aware code.
This commit was SVN r28476.
2013-05-10 21:29:08 +00:00
Jeff Squyres
456df1c9f7 Remove redundant opal_output() messages from the module; the called
functions will now show_help() their own error messages if something
goes wrong (per r28470).

This commit was SVN r28471.

The following SVN revision numbers were found above:
  r28470 --> open-mpi/ompi@2ff95a7739
2013-05-10 15:12:07 +00:00
Jeff Squyres
2ff95a7739 Proper show_help error messages for LAMA.
This commit was SVN r28470.
2013-05-10 15:06:25 +00:00
Ralph Castain
353c77e659 Re-enable udcm for test purposes - appears to be working, but needs broader exposure to MTT
This commit was SVN r28468.
2013-05-09 16:10:29 +00:00
Tom Naughton
26acf8adb1 + revert part of changeset r28456 that slighly modified behavior of
".ompi_ignore" to ignore entire framework and avoided adding "ignored"
  frameworks to the autogenerated "frameworks.h" header file.

  This change restores previous behavior. 

This commit was SVN r28466.

The following SVN revision numbers were found above:
  r28456 --> open-mpi/ompi@0a950009be
2013-05-08 18:10:36 +00:00
Jeff Squyres
cad1d920b2 Check to ensure that we have struct ifreq.ifr_mtu before we try to use
it, because Solaris although has SIOCFIGMTU, it curiously does not
have ifreq.ifr_mtu.

This commit was SVN r28460.
2013-05-07 13:51:50 +00:00
Jeff Squyres
4b9b3a81ff Update the list of post-1.5.2 r numbers from hwloc that we have
committed here.

This commit was SVN r28458.
2013-05-07 01:22:06 +00:00