Commit graph

18295 commits

Author SHA1 Message Date
Rolf vandeVaart
62ab008017 Fix SEGV caused by missing CUDA initialization.
This commit was SVN r28601.
2013-06-07 18:31:36 +00:00
Rolf vandeVaart
1230029aa1 The debug messages were swapped. Fixed.
This commit was SVN r28600.
2013-06-07 17:23:41 +00:00
Vishwanath Venkatesan
0b727f84da Avoid malloc of zero bytes by adding a check.
This commit was SVN r28597.
2013-06-06 14:08:57 +00:00
Edgar Gabriel
2d4655a05a Logic has been revised compared to the previous implementation.
This commit was SVN r28594.
2013-06-05 23:47:42 +00:00
Edgar Gabriel
03c1db7a3a fix the calculation of the UNIFORM flag.
This commit was SVN r28593.
2013-06-05 23:18:50 +00:00
Vishwanath Venkatesan
7d6a05982a Removing the gather_array based on the flag UNIFORM FVIEW for read all operations (dynamic/static),
+ Disabling Timing data extraction by default in dynamic write all

This commit was SVN r28592.
2013-06-05 21:35:37 +00:00
Vishwanath Venkatesan
55878674d7 1. Removing the allgather_array based on the flag UNIFORM FVIEW. This is not really an optimization.
2. Fixing some of the debug printfs; these are outdated.

This commit was SVN r28591.
2013-06-05 21:30:15 +00:00
Jeff Squyres
713e3aa3db Refs trac:3626: that ticket specifically refers to the v1.6 branch; this
commit is the trunk version of what is needed for #3626.

Add the "ignore_device" field to the INI file.  This allows us to
specifically list devices that should be ignored by the openib BTL
(such as the Intel Phi, at least as of May 2013 -- see #3626).  

Also add the Intel Phi to the ini file, and set its ignore_device=1.

Finally, add the concept of counting intentionally ignored verbs
devices.  Devices are ignored for one of two reasons:

 * If the number of allowed ports on that device is 0 (i.e., if
   if_include/if_exclude was set such that we're intentionally
   ignoring this device).
 * If the INI ignore_device field for this device is set to 1.

Once we have the count of devices that were intentionally ignored,
only show the "Hey, there's verbs devices that you're not using!"
show_help message if there are devices that were ''unintentionally''
ignored.

This commit was SVN r28589.

The following Trac tickets were found above:
  Ticket 3626 --> https://svn.open-mpi.org/trac/ompi/ticket/3626
2013-06-05 12:12:09 +00:00
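
The counting rule described in 713e3aa3db can be sketched roughly as follows. This is an illustration in Python, not the openib BTL's actual C code; the `Device` record, its field names, and `should_warn_unused` are hypothetical names introduced here, and only the two "intentionally ignored" conditions come from the commit message.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Device:
    """Hypothetical record describing one verbs device."""
    name: str
    allowed_ports: int  # ports left after if_include/if_exclude filtering
    ini_ignore: bool    # True if the INI file sets ignore_device=1
    in_use: bool        # True if the BTL ended up using the device


def intentionally_ignored(dev: Device) -> bool:
    # The two reasons listed in the commit message.
    return dev.allowed_ports == 0 or dev.ini_ignore


def should_warn_unused(devices: List[Device]) -> bool:
    """Warn only if at least one unused device was NOT intentionally ignored."""
    unused = [d for d in devices if not d.in_use]
    num_intentional = sum(1 for d in unused if intentionally_ignored(d))
    return len(unused) > num_intentional
```

With one device in use and one device ignored through the INI file, `should_warn_unused` returns `False`; adding a third device that is unused for any other reason makes it return `True`.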
Jeff Squyres
3019b7a3f8 Oops! Remove duplicate registration.
This commit was SVN r28588.
2013-06-05 11:55:19 +00:00
Jeff Squyres
1de00b17ad Properly check the return status from registering the MCA params.
This commit was SVN r28587.
2013-06-05 11:53:18 +00:00
Nathan Hjelm
e48bd9809e Add useful messages for MPI_T error codes
This commit was SVN r28584.
2013-06-04 23:18:44 +00:00
Jeff Squyres
d692aba672 Remove the DR PML. It was abandoned long ago. It had a nice life,
a few papers, and now a decent demise with respect.  

This commit was SVN r28582.
2013-06-04 19:36:16 +00:00
Joshua Ladd
61ffb47573 Minor fix for the min-dist mapping algorithm: we need to call 'get_nbobjs_by_type' first, before we get the sorted list of nodes - we need to add node objects and fill them in the summary object for the current topology. This patch was submitted by Elena Elkina and pushed by Josh Ladd. This should be added to cmr:v1.7:reviewer=jladd
This commit was SVN r28578.
2013-05-31 15:19:59 +00:00
Jeff Squyres
d1dc4da292 Fix typo (the debugger might not be TotalView).
This commit was SVN r28577.
2013-05-31 00:39:05 +00:00
Edgar Gabriel
87b3782b7f arghh, copy-and-paste error, status->_ucount has to be set to 0 not max_data for count=0.
This commit was SVN r28576.
2013-05-30 22:00:29 +00:00
Edgar Gabriel
9daec82f17 - make a fileview of 0 bytes work in ompio
- fixes the bug reported in ticket 3619 (which is already closed) also for ompio

This commit was SVN r28575.
2013-05-30 21:33:13 +00:00
Nathan Hjelm
e61a1aa865 Update LANL XE-6 platform files
This commit was SVN r28574.
2013-05-30 18:33:27 +00:00
George Bosilca
72877f078f Per MPI 3.0, a count equal to zero has a clear meaning: no modifications
of the original datatype are allowed (neither in the type map nor in the extent).
Make it clear in the code.
Allow 0-count cases in the contiguous memory check.

This commit was SVN r28568.
2013-05-29 16:02:54 +00:00
Jeff Squyres
f85dca0285 Fix spelling mistake that has been there for a long, long time...
This commit was SVN r28562.
2013-05-24 22:28:15 +00:00
Jeff Squyres
e339aa7561 Add missing svn:ignore on new mindist component
This commit was SVN r28561.
2013-05-24 15:58:47 +00:00
Rolf vandeVaart
3d1d158a80 Do not abort in BTL. Rather, callback into PML error function. Thanks George for review.
This commit was SVN r28559.
2013-05-23 18:45:23 +00:00
Brian Barrett
d4c33af0cd Update news to match the v1.7 branch
This commit was SVN r28558.
2013-05-23 17:56:12 +00:00
Rolf vandeVaart
7771857991 Adjust how cuda.h is found. It can be found in the with-cuda dir now.
This commit was SVN r28555.
2013-05-22 22:04:46 +00:00
George Bosilca
a9aae9c538 Patch based on Takahiro Kawashima fixing the issues with some
of the Fortran datatypes. This patch prevents the copy of the
datatype description from the OPAL to the OMPI layer in order
to decrease the memory requirements.

This commit was SVN r28553.
2013-05-22 18:35:21 +00:00
Jeff Squyres
6d173af329 This commit introduces a new "mindist" ORTE RMAPS mapper, as well as
some relevant updates/new functionality in the opal/mca/hwloc and
orte/mca/rmaps bases.  This work was mainly developed by Mellanox,
with a bunch of advice from Ralph Castain, and some minor advice from
Brice Goglin and Jeff Squyres.

Even though this is mainly Mellanox's work, Jeff is committing only
for logistical reasons (he holds the hg+svn combo tree, and can
therefore commit it directly back to SVN).

-----

Implemented a distance-based mapping algorithm as a new "mindist"
component in the rmaps framework.  It allows mapping processes to NUMA
nodes based on PCI locality information as reported by the BIOS, from the
NUMA node closest to the device to the furthest.

To use this algorithm, specify:

   {{{mpirun --map-by dist:<device_name>}}}

where <device_name> can be mlx5_0, ib0, etc.

There are two modes provided:

 1. bynode: load-balancing across nodes
 1. byslot: go through slots sequentially (i.e., the first nodes are
     more loaded)

These options are regulated by the optional ''span'' modifier; the
command line parameter looks like:

    {{{mpirun --map-by dist:<device_name>,span}}}

So, for example, if there are 2 nodes, each with 8 cores, and we'd
like to run 10 processes, the mindist algorithm will place 8 processes
on the first node and 2 on the second by default. But if you want to
place 5 processes on each node, you can add the span modifier to your
command line to do that.

If there are two NUMA nodes on the node, each with 4 cores, and we run
6 processes, the mindist algorithm will try to find the NUMA node closest
to the specified device and, if successful, will place 4 processes
on that NUMA node and the remaining two on the next NUMA node.

You can also specify the number of cpus per MPI process. This option
is handled so that we map as many processes to the closest NUMA as we
can (number of available processors at the NUMA divided by number of
cpus per rank) and then go on with the next closest NUMA.

The default binding option for this mapping is bind-to-numa. It applies
if you don't specify any binding policy. If you specify a binding
level that is "lower" than NUMA (i.e., hwthread, core, socket), processes
are bound at whatever level you specify.

This commit was SVN r28552.
2013-05-22 13:04:40 +00:00
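
The placement examples in 6d173af329 can be reproduced with a small model. This is a sketch in Python under simplifying assumptions, not the rmaps component itself: `map_across_nodes`, `map_across_numa`, and the `(distance, num_cpus)` tuples are hypothetical, and raising an error when slots run out is a choice made here for brevity.

```python
def map_across_nodes(num_procs, slots_per_node, span=False):
    """Return how many processes land on each node.

    Without span, nodes are filled one after another; with span,
    processes are dealt out round-robin so nodes are balanced.
    """
    counts = [0] * len(slots_per_node)
    remaining = num_procs
    if span:
        while remaining > 0:
            placed = False
            for i, slots in enumerate(slots_per_node):
                if remaining > 0 and counts[i] < slots:
                    counts[i] += 1
                    remaining -= 1
                    placed = True
            if not placed:
                raise ValueError("not enough slots")
    else:
        for i, slots in enumerate(slots_per_node):
            take = min(remaining, slots)
            counts[i] = take
            remaining -= take
        if remaining > 0:
            raise ValueError("not enough slots")
    return counts


def map_across_numa(num_procs, numa_nodes, cpus_per_rank=1):
    """Fill NUMA nodes closest-first.

    numa_nodes is a list of (distance_to_device, num_cpus) pairs; the
    capacity of each NUMA node is num_cpus // cpus_per_rank.
    """
    order = sorted(range(len(numa_nodes)), key=lambda i: numa_nodes[i][0])
    counts = [0] * len(numa_nodes)
    remaining = num_procs
    for i in order:
        capacity = numa_nodes[i][1] // cpus_per_rank
        take = min(remaining, capacity)
        counts[i] = take
        remaining -= take
    if remaining > 0:
        raise ValueError("not enough cpus")
    return counts
```

In this model, `map_across_nodes(10, [8, 8])` gives `[8, 2]` and adding `span=True` gives `[5, 5]`, matching the two-node example. `map_across_numa(6, [(20, 4), (10, 4)])` gives `[2, 4]`: the second NUMA node is closer to the device, so it receives 4 processes first.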
Jeff Squyres
55382c1bf8 Bring over upstream hwloc trunk commit
https://svn.open-mpi.org/trac/hwloc/changeset/5592 to fix the merging
of groups when they are I/O objects.

This commit was SVN r28551.
2013-05-22 12:34:59 +00:00
Jeff Squyres
43a534a5c6 Revert r28544: the original code was fine.
This commit was SVN r28549.

The following SVN revision numbers were found above:
  r28544 --> open-mpi/ompi@c830d96673
2013-05-21 16:06:08 +00:00
Jeff Squyres
c830d96673 Silence compiler warnings, as suggested by Alan Sayre.
This commit was SVN r28544.
2013-05-21 13:42:18 +00:00
Jeff Squyres
52f4a10c21 Gang together with r28542: I missed a secondary location where .git
needed to be mentioned.

This commit was SVN r28543.

The following SVN revision numbers were found above:
  r28542 --> open-mpi/ompi@772284b89a
2013-05-20 17:11:53 +00:00
Jeff Squyres
772284b89a Clarify that we do debug builds for .svn, .hg, and .git.
This commit was SVN r28542.
2013-05-20 17:10:30 +00:00
Nathan Hjelm
721779d7ab Per RFC: remove old MCA parameter system.
This commit was SVN r28541.
2013-05-20 15:36:13 +00:00
Ralph Castain
889bf60c64 Fix bad merge
This commit was SVN r28540.
2013-05-18 01:29:55 +00:00
Jeff Squyres
089c632cce Remove a bunch of dead code: gcc 4.7 warns of set-but-unused
variables.  So get rid of them.

This commit was SVN r28538.
2013-05-17 21:45:49 +00:00
Edgar Gabriel
1b1051da6c fix a bug in the calculation of the explicit offset. Use the opportunity to
clean up the code a bit.

This commit was SVN r28537.
2013-05-17 20:22:00 +00:00
Jeff Squyres
f43fde5394 Gah! How did we end up without a greek string? :-(
This commit was SVN r28536.
2013-05-17 13:25:43 +00:00
George Bosilca
4d9f30fb05 Fix issue identified by Takahiro Kawashima regarding the overwriting
of the OPAL datatype descriptions upon MPI_Init. Now each layer (OPAL
and OMPI) uses its own descriptions for the predefined datatypes,
thus preventing over-writing of the other layer data description.

This commit was SVN r28535.
2013-05-17 13:09:16 +00:00
Ralph Castain
e100b8d165 don't need the return value, but should check for error
This commit was SVN r28534.
2013-05-16 15:15:02 +00:00
Ralph Castain
3e6e1046a3 fix a correctness issue by returning an error if waitall fails and invoking the mpi error handler
cmr:v1.7.2:reviewer=jsquyres

This commit was SVN r28533.
2013-05-16 15:04:37 +00:00
Jeff Squyres
128cc27417 Minor type fix (they're both enums/ints, so the compiler previously
silently cast them).

This commit was SVN r28532.
2013-05-16 00:47:37 +00:00
Ralph Castain
93ba4247f8 remove extra paren when --without-hwloc
This commit was SVN r28530.
2013-05-15 21:31:45 +00:00
Ralph Castain
3a372a65b8 Mapping policies must be tested as equalities as they are values, not bitmasks
This commit was SVN r28526.
2013-05-15 13:45:00 +00:00
Ralph Castain
29e4b0cc50 Cannot test equality on mapping directives as it is a bitmask
This commit was SVN r28525.
2013-05-15 13:41:49 +00:00
Matthias Jurenz
ef0a080028 Changes to VT:
- fixed compiler warnings when compiling for 32-bit
	- MPI wrapper generator scripts:
		- removed non-posix call to length(array)
		- exit scripts if any statement returns a non-true return value (set -e)

This commit was SVN r28524.
2013-05-15 10:44:51 +00:00
Ralph Castain
1ec13d530c Allow simple way to request comparison to full address regardless of addr family
This commit was SVN r28519.
2013-05-14 22:08:39 +00:00
Ralph Castain
eb2edb4b2b Silence warning
This commit was SVN r28516.
2013-05-14 22:00:01 +00:00
Ralph Castain
04b11accd3 Silence a few warnings
This commit was SVN r28515.
2013-05-14 21:58:40 +00:00
Rolf vandeVaart
8a8ea9ba1b Fix compile error in optimize build for CUDA-aware code.
This commit was SVN r28512.
2013-05-14 21:07:27 +00:00
Rolf vandeVaart
91fdb423d7 Fix warning in CUDA-aware code.
This commit was SVN r28511.
2013-05-14 21:04:15 +00:00
Rolf vandeVaart
52ebb0b17f Change some opal_output to OPAL_OUTPUT per CMR review.
This commit was SVN r28510.
2013-05-14 20:49:42 +00:00
Ralph Castain
5296099ecb Fix the cpus-per-rank when binding to hwthreads. Add cpus-per-rank to diag printout
Thanks to Elena for reporting the problem

This commit was SVN r28508.
2013-05-14 20:17:50 +00:00