Commit graph

19536 commits

Author SHA1 Message Date
Ralph Castain
a8a9801a0b Ensure an orted exits with non-zero status if it is unable to send a message. Add more diagnostic messages to the OOB set_addr code
cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r30701.
2014-02-12 19:44:01 +00:00
Ralph Castain
d0e8aeaee4 Add the time_t datatype to the DSS
This commit was SVN r30700.
2014-02-12 19:37:21 +00:00
Christoph Niethammer
010a806a58 Omit usage of precalculated prime numbers and factorize directly.
Optimization of the MPI_Dims_create function, which omits the usage of
precalculated prime numbers and factorizes directly, as discussed on the developer
list.

cmr=v1.7.5:ticket=4217:reviewer=jsquyres

This commit was SVN r30695.

The following Trac tickets were found above:
  Ticket 4217 --> https://svn.open-mpi.org/trac/ompi/ticket/4217
2014-02-12 08:47:33 +00:00
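
The direct factorization that commit 010a806a58 describes can be illustrated with a short sketch. This is not the Open MPI implementation (which is written in C); the names `factorize` and `balanced_dims` are hypothetical, and the greedy assignment of factors to dimensions is only one simple heuristic, assumed here for illustration.

```python
def factorize(n: int) -> list[int]:
    """Return the prime factors of n in non-decreasing order, with multiplicity."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = []
    d = 2
    # Trial division: no table of precalculated primes is needed.
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # Whatever is left after dividing out all small factors is itself prime.
        factors.append(n)
    return factors


def balanced_dims(nnodes: int, ndims: int) -> list[int]:
    """Greedy heuristic (an assumption, not Open MPI's algorithm):
    give each factor, largest first, to the currently smallest dimension."""
    dims = [1] * ndims
    for f in sorted(factorize(nnodes), reverse=True):
        smallest = dims.index(min(dims))
        dims[smallest] *= f
    # Report the dimensions in non-increasing order.
    return sorted(dims, reverse=True)


if __name__ == "__main__":
    print(factorize(360))
    print(balanced_dims(12, 2))
```

Note that the loop bound `d * d <= n` is safe only because the remaining cofactor is appended afterwards; stopping at the square root without that final step would drop large prime factors.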
Christoph Niethammer
85dce869c8 Move parameter check into the appropriate code section at the beginning.
The freeprocs variable was obtained from nnodes, so check the value of nnodes
instead, at the beginning in the MPI_PARAM_CHECK code section, as discussed on the
developer list.

cmr=v1.7.5:reviewer=jsquyres:subject=move parameter check to begin


jsquyres, please review this CMR. Thanks.

This commit was SVN r30694.
2014-02-12 08:30:13 +00:00
Ralph Castain
1473dde6ea Okay, once again caught by the blasted hwloc inability to cleanly handle caches. Protect the calls to get_depth by first checking to see if it is a "cache", then use a cache-specific function to get the stupid data. Very, very irritating.
cmr=v1.7.5:reviewer=jsquyres:subject=treat caches as something different yet again

This commit was SVN r30693.
2014-02-12 01:45:06 +00:00
Ralph Castain
3e12466f60 Ouch - fix bad race condition in direct modex
cmr=v1.7.5:reviewer=hjelmn:subject=fix bad race condition in direct modex

This commit was SVN r30691.
2014-02-11 23:21:27 +00:00
Ralph Castain
39dcbbe883 Improve the script to ignore executables and Mac-specific files of no interest
This commit was SVN r30690.
2014-02-11 22:53:14 +00:00
Ralph Castain
1565816988 Do a little better job of cleaning up the session directory left by mpirun by ensuring we delete the event associated with debugger attachment and unlinking the pipe used for that purpose. Also, we no longer leave "abort" files around, so remove that check when deleting session directory trees
cmr=v1.7.5:reviewer=jsquyres:subject=cleanup session directories better

This commit was SVN r30689.
2014-02-11 22:16:17 +00:00
Ralph Castain
fa7b686ccc Provide better messages when we don't find any included interfaces, and/or don't find any interfaces for use by OOB.
cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r30675.
2014-02-11 19:29:03 +00:00
Dave Goodell
72c0b89e8f usnic: handle missing ibv_event_type_str
Some older versions of libibverbs do not have `ibv_event_type_str`,
leading to compilation failures on older machines, irrespective of
whether they could ever support usNIC anyway.  If we encounter any other
build issues related to "old verbs" then we should just cause the usnic
BTL to disqualify itself when it encounters "old" traits.

Thanks to Paul Hargrove for reporting the issue:
http://www.open-mpi.org/community/lists/devel/2014/02/14056.php

Reviewed-by: Jeff Squyres <jsquyres@cisco.com>

cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r30674.
2014-02-11 19:18:29 +00:00
Jeff Squyres
173dd6d503 Set the VERSION file back to what it's supposed to be:
- request SVN version number
- say "Unreleased developer version"

This commit was SVN r30673.
2014-02-11 18:33:24 +00:00
Ralph Castain
b566cd5e30 Protect against no modifiers
Refs trac:4117

This commit was SVN r30672.

The following Trac tickets were found above:
  Ticket 4117 --> https://svn.open-mpi.org/trac/ompi/ticket/4117
2014-02-11 17:34:37 +00:00
Ralph Castain
6fa34407bf Handle modifiers to the --map-by dist option
Refs trac:4117

This commit was SVN r30671.

The following Trac tickets were found above:
  Ticket 4117 --> https://svn.open-mpi.org/trac/ompi/ticket/4117
2014-02-11 17:19:05 +00:00
Ralph Castain
008a4e2e35 Set ignores
This commit was SVN r30670.
2014-02-11 17:18:14 +00:00
Nathan Hjelm
6194bb502a vader: attempt to work around SGI UV issues by creating a segment that
only goes up to VADER_MAX_ADDRESS instead of 0xfffffffffffffffful.

cmr=v1.7.5:ticket=trac:4216

This commit was SVN r30669.

The following Trac tickets were found above:
  Ticket 4216 --> https://svn.open-mpi.org/trac/ompi/ticket/4216
2014-02-11 16:28:25 +00:00
Nathan Hjelm
f2f6a7fe81 vader: don't finalize an endpoint that is already finalized
Fixes trac:4252

cmr=v1.7.5:ticket=trac:4053

This commit was SVN r30668.

The following Trac tickets were found above:
  Ticket 4053 --> https://svn.open-mpi.org/trac/ompi/ticket/4053
  Ticket 4252 --> https://svn.open-mpi.org/trac/ompi/ticket/4252
2014-02-11 16:15:29 +00:00
Adrian Reber
56c23f22be CRS/SELF: fix compiler warning: variable 'callback_matched' set but not used
This commit was SVN r30667.
2014-02-11 14:45:29 +00:00
Adrian Reber
42d7b014e4 CRS/CRIU: added CRIU as new CRS component
To be able to checkpoint/restart using criu (criu.org) a new
CRS component is added which is based on criu. This first commit
provides the minimal set of functions and configure script options
to enable --with-criu and link against libcriu.so.
No actual checkpoint/restart functionality is yet implemented.
This is only the framework which needs to be filled with the
actual functionality.

This commit was SVN r30666.
2014-02-11 14:43:38 +00:00
Jeff Squyres
db41d749c1 Remove ASYNCHRONOUS from the ignore TKR mpi_f08 module.
It turns out that ASYNCHRONOUS should not be used with ignore TKR
dummy parameters (some compilers will [correctly] warn about this).

Many thanks to Rolf Rabenseifner and Christoph Niethammer, who noticed
the problem.

Submitted by Rolf Rabenseifner, reviewed by Jeff.

cmr=v1.7.5:reviewer=ompi-rm1.7:subject=Remove ASYNCHRONOUS from the ignore TKR mpi_f08 module.

This commit was SVN r30665.
2014-02-11 13:19:30 +00:00
Ralph Castain
2fe39756ea Update ignores
This commit was SVN r30664.
2014-02-11 03:07:36 +00:00
Ralph Castain
4781ea71b6 Correct the handling of various map/bind combinations when pe=N is given. Thanks to Elena Elkina for reporting it.
Refs trac:4117

This commit was SVN r30663.

The following Trac tickets were found above:
  Ticket 4117 --> https://svn.open-mpi.org/trac/ompi/ticket/4117
2014-02-11 03:05:26 +00:00
Ralph Castain
707e51d786 Check for --cpus-per-proc earlier, before the correct option can be processed. Thanks to Tetsuya Mishima for reporting it.
Refs trac:4117

This commit was SVN r30662.

The following Trac tickets were found above:
  Ticket 4117 --> https://svn.open-mpi.org/trac/ompi/ticket/4117
2014-02-11 02:53:53 +00:00
Ralph Castain
d66d2f5fb3 It is just fine to map by node or slot and bind, so ensure the switch statement includes those options. Thanks to Tetsuya Mishima for pointing it out.
Refs trac:4240

This commit was SVN r30661.

The following Trac tickets were found above:
  Ticket 4240 --> https://svn.open-mpi.org/trac/ompi/ticket/4240
2014-02-11 02:52:01 +00:00
Jeff Squyres
b4effd200f Remove num_pes,my_pe declarations from shmem.fh.
The openshmem test suite
(http://bongo.cs.uh.edu/site/sites/default/site_files/openshmem-test-suite-release-1.0d.tar.bz2)
declares num_pes and my_pe in each Fortran test file -- it apparently
doesn't expect shmem.fh to declare these functions.  Sigh.

cmr=v1.7.5:reviewer=miked:subject=the openshmem community is crazy

This commit was SVN r30660.
2014-02-11 01:36:26 +00:00
Ralph Castain
86e8a147c6 Resolve uninitialized variables on some systems. Thanks to Paul Hargrove for finding the problem and suggesting the patch.
cmr=v1.7.5:reviewer=ompi-gk1.7

This commit was SVN r30656.
2014-02-10 21:17:34 +00:00
Ralph Castain
7576d5964a Remove c++ cruft from configure
Refs trac:4247

This commit was SVN r30655.

The following Trac tickets were found above:
  Ticket 4247 --> https://svn.open-mpi.org/trac/ompi/ticket/4247
2014-02-10 19:23:39 +00:00
Ralph Castain
8d04d7408e Sigh - remove the c++ cruft from here too
Refs trac:4247

This commit was SVN r30654.

The following Trac tickets were found above:
  Ticket 4247 --> https://svn.open-mpi.org/trac/ompi/ticket/4247
2014-02-10 17:32:02 +00:00
Ralph Castain
a49e0db8dd We haven't supported a c++ wrapper for ORTE in quite some time
cmr=v1.7.5:reviewer=ompi-gk1.7:subject=remove c++ cruft

This commit was SVN r30653.
2014-02-10 17:16:30 +00:00
Nathan Hjelm
f45364746e vader: fix typos in r30626
cmr=v1.7.5:ticket=trac:4053

This commit was SVN r30652.

The following SVN revision numbers were found above:
  r30626 --> open-mpi/ompi@a8867a9ca4

The following Trac tickets were found above:
  Ticket 4053 --> https://svn.open-mpi.org/trac/ompi/ticket/4053
2014-02-10 16:15:43 +00:00
Nathan Hjelm
6dd29a05f1 basesmuma: Fix typos in r30627
cmr=v1.7.5:ticket=trac:4158

This commit was SVN r30651.

The following SVN revision numbers were found above:
  r30627 --> open-mpi/ompi@98ad6b3d1e

The following Trac tickets were found above:
  Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
2014-02-10 16:15:37 +00:00
Jeff Squyres
d6184172e9 Remove extra blank line.
This commit was SVN r30650.
2014-02-10 14:50:00 +00:00
Ralph Castain
1a12325094 Rats - need to include bydist in the mapping list
Refs trac:4117

This commit was SVN r30649.

The following Trac tickets were found above:
  Ticket 4117 --> https://svn.open-mpi.org/trac/ompi/ticket/4117
2014-02-09 16:17:05 +00:00
Ralph Castain
4e32a82638 If we are binding to hwthreads, then we need to treat hwthreads as cpus to get the mapping right
cmr=v1.7.5:reviewer=jsquyres:subject=set hwthreads to cpus when binding to them

This commit was SVN r30648.
2014-02-09 16:14:38 +00:00
Ralph Castain
0dc5f50d27 Add a plm component for local-only operation that doesn't require rsh/ssh to be installed. Requested by Fedora packagers for testing purposes.
cmr=v1.7.5:reviewer=jsquyres:subject=Add a plm component for local-only operation

This commit was SVN r30645.
2014-02-09 15:53:10 +00:00
Alex Margolin
636493393c OPENIB: Fixed error from writing to an uninitialized pipe.
The error was caused by leaving the pipe to the async thread uninitialized, then writing to it regardless.
The fix is to check for the existence of the async thread and the pipe to it.

Reviewed by miked

cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r30644.
2014-02-09 14:07:14 +00:00
Ralph Castain
ca0c806662 Resolve the problem of binding in inverted topologies - check the relative depth of the map and bind objects in the topology, and let that determine whether we bind downward or upwards.
cmr=v1.7.5:reviewer=jsquyres:subject=Resolve the problem of binding in inverted topologies

This commit was SVN r30643.
2014-02-09 05:30:17 +00:00
Ralph Castain
0ee38353ba In case there are stale session directories around, do a purge of the relevant session directory tree when an orted, HNP, or singleton starts. This won't help in the case of direct-launched apps, but it's the best we can do.
cmr=v1.7.5:reviewer=jsquyres:subject=purge stale session dirs at startup

This commit was SVN r30642.
2014-02-09 02:10:31 +00:00
Ralph Castain
1d8c061687 Fix a race condition that could result in assert failures during finalize. Ensure we shutdown the orte progress thread prior to finalizing the rml/oob frameworks so that no async operations are executing during destruct of the base-level lists and objects.
cmr=v1.7.5:reviewer=jsquyres:subject=fix race condition in finalize

This commit was SVN r30641.
2014-02-08 22:04:19 +00:00
Ralph Castain
5b8e1180cf Update a test
This commit was SVN r30640.
2014-02-08 22:00:12 +00:00
Mike Dubman
10f4bd4280 add help for --with-hcoll
Added by Josh, reviewed by Mike
cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r30637.
2014-02-08 18:56:18 +00:00
Ralph Castain
a94920276d Fix singleton MPI_Abort. Singletons no longer immediately start an HNP, but only launch one when they need it for comm_spawn. So there isn't anyone to send the "abort" report to, and thus we just exit after emitting our message.
cmr=v1.7.5:reviewer=jsquyres:subject=Fix singleton MPI_Abort

This commit was SVN r30635.
2014-02-08 18:15:07 +00:00
Nathan Hjelm
98ad6b3d1e bcol/basesmuma: fix initialization on 32-bit platforms
The initialization code did several allgathers on void *'s using
MPI_LONG_LONG_INT. This will produce the wrong result on 32-bit
platforms. Instead use MPI_BYTE with count = sizeof (void *).

cmr=v1.7.5:ticket=trac:4158

This commit was SVN r30627.

The following Trac tickets were found above:
  Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
2014-02-08 00:00:30 +00:00
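
One way the datatype mismatch described in commit 98ad6b3d1e can show up is sketched below. This is a pure-Python simulation, not MPI code: `gather` and `as_pointers` are hypothetical helpers, the 4-byte pointer width stands in for a 32-bit platform, and `0xDEADBEEF` stands in for whatever memory happens to follow each pointer variable.

```python
import struct

PTR_SIZE = 4        # assumed pointer width on a 32-bit platform
LONG_LONG_SIZE = 8  # width of the element type used by the buggy gather


def gather(contributions: list[bytes], nbytes: int) -> bytes:
    """Simulate an allgather: take nbytes from each rank and concatenate."""
    return b"".join(c[:nbytes] for c in contributions)


def as_pointers(buf: bytes, count: int) -> list[int]:
    """Read the first `count` pointer-sized little-endian values from buf."""
    return list(struct.unpack(f"<{count}I", buf[: count * PTR_SIZE]))


pointers = [0x1000, 0x2000, 0x3000]
# Each rank's memory: its 4-byte pointer followed by unrelated adjacent bytes.
memory = [struct.pack("<II", p, 0xDEADBEEF) for p in pointers]

# Gathering 8 bytes per rank drags the adjacent bytes into the result.
wrong = as_pointers(gather(memory, LONG_LONG_SIZE), len(pointers))

# Gathering exactly sizeof(void *) bytes per rank keeps the stride correct.
right = as_pointers(gather(memory, PTR_SIZE), len(pointers))

if __name__ == "__main__":
    print([hex(v) for v in wrong])
    print([hex(v) for v in right])
```

In the simulation the 8-byte gather interleaves junk between the pointers, while the byte-wise gather with a count equal to the pointer size returns them intact, which is the shape of the fix the commit message names.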
Nathan Hjelm
a8867a9ca4 btl/vader: fix 32-bit support
cmr=v1.7.5:ticket=trac:4053

This commit was SVN r30626.

The following Trac tickets were found above:
  Ticket 4053 --> https://svn.open-mpi.org/trac/ompi/ticket/4053
2014-02-07 23:57:36 +00:00
Nathan Hjelm
77869c3232 bcol/basesmuma: fix several bugs in the basesmuma code
Found two bugs in basesmuma:

 - Release all resources when tearing down the bcol module.

 - Always call the allreduce in the smcm code. We do not know
   beforehand whether all procs have all the files mapped.

cmr=v1.7.5:ticket=trac:4158

This commit was SVN r30623.

The following Trac tickets were found above:
  Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
2014-02-07 21:39:24 +00:00
Ralph Castain
bc7cc09749 After a lot of pain, I've managed to resolve the problem of conflicting mapping directives caused by mismatched MCA params - i.e., where someone has one variant of an MCA param (e.g., rmaps_base_mapping_policy) in their default MCA param file, and then specifies another variant (e.g., --npernode) on the command line. I can't fully resolve the problem as there is no way to know precisely what the user meant - we can only guess which param was really intended since the MCA param system can't apply its normal precedence rules.

So...print a big "deprecated" warning for the old params and error out if a conflict is detected. I know that isn't what people really wanted, but it's the best we can do. If only the old style param is given, then process it after the warning.

Extend the current map-by param to add support for ppr and cpus-per-proc, adding the latter to the list of allowed modifiers using "pe=n" for processing elements/proc. Thus, you can map-by socket:pe=2,oversubscribe to map by socket, binding 2 processing elements/process, with oversubscription allowed. Or you can map-by ppr:2:socket:pe=4 to map two processes to every socket in the allocation, binding each process to 4 processing elements.

For those wondering, a processing element is defined as a hwthread if --use-hwthreads-as-cpus is given, or else as a core.

Refs trac:4117

This commit was SVN r30620.

The following Trac tickets were found above:
  Ticket 4117 --> https://svn.open-mpi.org/trac/ompi/ticket/4117
2014-02-07 21:25:40 +00:00
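
The map-by syntax described in commit bc7cc09749 can be illustrated with a small parser. This is a sketch of the directive format as written in the commit message only; `parse_map_by` is a hypothetical name, the returned dictionary layout is an assumption, and nothing here reflects how ORTE actually parses the option.

```python
def parse_map_by(spec: str) -> dict:
    """Split a map-by directive into its policy, optional ppr count, and modifiers."""
    fields = spec.split(":")
    if fields[0] == "ppr":
        # Form: ppr:<count>:<object>[:<modifiers>]
        if len(fields) < 3:
            raise ValueError(f"incomplete ppr directive: {spec!r}")
        parsed = {"policy": fields[2], "ppr": int(fields[1])}
        rest = fields[3:]
    else:
        # Form: <object>[:<modifiers>]
        parsed = {"policy": fields[0], "ppr": None}
        rest = fields[1:]
    if len(rest) > 1:
        raise ValueError(f"unexpected extra fields in directive: {spec!r}")

    modifiers = {}
    if rest:
        # Modifiers are comma-separated; "pe=2" carries a value, "oversubscribe" is a flag.
        for item in rest[0].split(","):
            key, sep, value = item.partition("=")
            modifiers[key] = int(value) if sep else True
    parsed["modifiers"] = modifiers
    return parsed


if __name__ == "__main__":
    print(parse_map_by("socket:pe=2,oversubscribe"))
    print(parse_map_by("ppr:2:socket:pe=4"))
```

The two directives in the demo are the ones quoted in the commit message: the first maps by socket with 2 processing elements per process and oversubscription allowed, the second places two processes on every socket with 4 processing elements each.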
Pavel Shamis
3a683419c5 Fixing broken dependency between ML/BCOLS
This is a hot-fix patch for the issue reported by Ralph.
In the future we plan to restructure the ml data structure layout.

Tested by Nathan.

cmr=v1.7.5:ticket=trac:4158

This commit was SVN r30619.

The following Trac tickets were found above:
  Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
2014-02-07 19:15:45 +00:00
Jeff Squyres
6f8e76df7e Revert r30539 and r30540; using the sqrt() to limit the computation is
just plain wrong (i.e., it gives wrong answers).  

When time permits, perhaps we can put in a better algorithm for
MPI_DIMS_CREATE (Andreas Schäfer mentioned that nnodes can now be on
the order of millions, and the current algorithm is... inefficient, at
best).

This commit was SVN r30606.

The following SVN revision numbers were found above:
  r30539 --> open-mpi/ompi@fb67d98867
  r30540 --> open-mpi/ompi@4417ed2133
2014-02-07 13:46:48 +00:00
Ralph Castain
74d3393a4f Revert r30600, r30602-30604 as the first one broke the tarball and the others couldn't fix it
This commit was SVN r30605.

The following SVN revision numbers were found above:
  r30600 --> open-mpi/ompi@7d2c4cb468
  r30602 --> open-mpi/ompi@9e751a0302
  r30604 --> open-mpi/ompi@3012c280cf

Revision number ranges (suitable for "git log"):
  r30602-30604 --> open-mpi/ompi@9e751a03^..3012c280
2014-02-07 04:38:06 +00:00
Ralph Castain
3012c280cf I surrender - this code is just too interbred with other components for me to clean up, so turn it off for now
This commit was SVN r30604.
2014-02-07 04:16:21 +00:00
Ralph Castain
3954311bac We have rules about not cross-integrating components, even across frameworks - please follow them.
This commit was SVN r30603.
2014-02-07 03:46:45 +00:00