Commit Graph

19501 Commits

Author SHA1 Message Date
Ralph Castain
ca0c806662 Resolve the problem of binding in inverted topologies - check the relative depth of the map and bind objects in the topology, and let that determine whether we bind downward or upwards.
cmr=v1.7.5:reviewer=jsquyres:subject=Resolve the problem of binding in inverted topologies

This commit was SVN r30643.
2014-02-09 05:30:17 +00:00
Ralph Castain
0ee38353ba In case there are stale session directories around, do a purge of the relevant session directory tree when an orted, HNP, or singleton starts. This won't help in the case of direct-launched apps, but it's the best we can do.
cmr=v1.7.5:reviewer=jsquyres:subject=purge stale session dirs at startup

This commit was SVN r30642.
2014-02-09 02:10:31 +00:00
Ralph Castain
1d8c061687 Fix a race condition that could result in assert failures during finalize. Ensure we shutdown the orte progress thread prior to finalizing the rml/oob frameworks so that no async operations are executing during destruct of the base-level lists and objects.
cmr=v1.7.5:reviewer=jsquyres:subject=fix race condition in finalize

This commit was SVN r30641.
2014-02-08 22:04:19 +00:00
Ralph Castain
5b8e1180cf Update a test
This commit was SVN r30640.
2014-02-08 22:00:12 +00:00
Mike Dubman
10f4bd4280 add help for --with-hcoll
Added by Josh, reviewed by Mike
cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r30637.
2014-02-08 18:56:18 +00:00
Ralph Castain
a94920276d Fix singleton MPI_Abort. Singletons no longer immediately start an HNP, but only launch one when they need it for comm_spawn. So there isn't anyone to send the "abort" report to, and thus we just exit after emitting our message.
cmr=v1.7.5:reviewer=jsquyres:subject=Fix singleton MPI_Abort

This commit was SVN r30635.
2014-02-08 18:15:07 +00:00
Nathan Hjelm
98ad6b3d1e bcol/basesmuma: fix initialization on 32-bit platforms
The initialization code did several allgathers on void *'s using
MPI_LONG_LONG_INT. This will produce the wrong result on 32-bit
platforms. Instead use MPI_BYTE with count = sizeof (void *).

cmr=v1.7.5:ticket=trac:4158

This commit was SVN r30627.

The following Trac tickets were found above:
  Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
2014-02-08 00:00:30 +00:00
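The pointer-size mismatch described in the bcol/basesmuma commit above can be illustrated without MPI. The following is a minimal sketch, not Open MPI code: `simulated_allgather`, `rank_memory`, and the 4-byte pointer size are assumptions chosen to model a 32-bit platform, where a `long long` is typically 8 bytes while a `void *` is 4.

```python
import struct

POINTER_SIZE = 4    # assumption: 32-bit platform, sizeof(void *) == 4
LONG_LONG_SIZE = 8  # assumption: typical sizeof(long long)


def simulated_allgather(rank_memory, elem_size):
    """Hypothetical stand-in for an allgather: copy elem_size bytes per rank."""
    return b"".join(mem[:elem_size] for mem in rank_memory)


def decode_pointers(buf):
    """Interpret a gathered buffer as a sequence of 4-byte little-endian pointers."""
    return [value for (value,) in struct.iter_unpack("<I", buf)]


# Each rank holds one 4-byte pointer followed by four unrelated bytes.
rank_memory = [
    struct.pack("<I", 0x1000 * (rank + 1)) + b"\xAA\xAA\xAA\xAA"
    for rank in range(3)
]

# Gathering 8 bytes per rank drags the unrelated bytes into the result.
wrong = simulated_allgather(rank_memory, LONG_LONG_SIZE)
# Gathering sizeof(void *) bytes per rank copies only the pointers.
right = simulated_allgather(rank_memory, POINTER_SIZE)

print([hex(p) for p in decode_pointers(right)])
print([hex(p) for p in decode_pointers(wrong)])
```

Sizing the transfer in bytes from the pointer's own size, as the commit does with MPI_BYTE and count = sizeof (void *), keeps the gathered buffer correct regardless of pointer width.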
Nathan Hjelm
a8867a9ca4 btl/vader: fix 32-bit support
cmr=v1.7.5:ticket=trac:4053

This commit was SVN r30626.

The following Trac tickets were found above:
  Ticket 4053 --> https://svn.open-mpi.org/trac/ompi/ticket/4053
2014-02-07 23:57:36 +00:00
Nathan Hjelm
77869c3232 bcol/basesmuma: fix several bugs in the basesmuma code
Found two bugs in basesmuma:

 - Release all resources when tearing down the bcol module.

 - Always call the allreduce in the smcm code. We do not know
   beforehand whether all procs have all the files mapped.

cmr=v1.7.5:ticket=trac:4158

This commit was SVN r30623.

The following Trac tickets were found above:
  Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
2014-02-07 21:39:24 +00:00
Ralph Castain
bc7cc09749 After a lot of pain, I've managed to resolve the problem of conflicting mapping directives caused by mismatched MCA params - i.e., where someone has one variant of an MCA param (e.g., rmaps_base_mapping_policy) in their default MCA param file, and then specifies another variant (e.g., --npernode) on the command line. I can't fully resolve the problem as there is no way to know precisely what the user meant - we can only guess which param was really intended since the MCA param system can't apply its normal precedence rules.

So...print a big "deprecated" warning for the old params and error out if a conflict is detected. I know that isn't what people really wanted, but it's the best we can do. If only the old style param is given, then process it after the warning.

Extend the current map-by param to add support for ppr and cpus-per-proc, adding the latter to the list of allowed modifiers using "pe=n" for processing elements/proc. Thus, you can map-by socket:pe=2,oversubscribe to map by socket, binding 2 processing elements/process, with oversubscription allowed. Or you can map-by ppr:2:socket:pe=4 to map two processes to every socket in the allocation, binding each process to 4 processing elements.

For those wondering, a processing element is defined as a hwthread if --use-hwthreads-as-cpus is given, or else as a core.

Refs trac:4117

This commit was SVN r30620.

The following Trac tickets were found above:
  Ticket 4117 --> https://svn.open-mpi.org/trac/ompi/ticket/4117
2014-02-07 21:25:40 +00:00
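The two map-by forms quoted in the commit above (`socket:pe=2,oversubscribe` and `ppr:2:socket:pe=4`) follow a small grammar that can be sketched as below. This is an illustrative parser only, not the ORTE implementation; `parse_map_by` and the dictionary keys are hypothetical names, and only the syntax shown in the commit message is handled.

```python
def parse_map_by(spec):
    """Split a map-by string into policy, optional ppr count, pe value, and flags.

    Hypothetical helper: covers only the two forms quoted in the commit message.
    """
    fields = spec.split(":")
    result = {"policy": None, "ppr": None, "pe": None, "flags": []}

    if fields[0] == "ppr":
        # Form: ppr:<count>:<object>[:<modifiers>]
        if len(fields) < 3:
            raise ValueError("ppr requires a count and an object: " + spec)
        result["ppr"] = int(fields[1])
        result["policy"] = fields[2]
        modifiers = fields[3:]
    else:
        # Form: <object>[:<modifiers>]
        result["policy"] = fields[0]
        modifiers = fields[1:]

    # Modifiers are comma-separated; "pe=n" sets processing elements per process.
    for modifier in ",".join(modifiers).split(","):
        if not modifier:
            continue
        if modifier.startswith("pe="):
            result["pe"] = int(modifier[len("pe="):])
        else:
            result["flags"].append(modifier)
    return result


print(parse_map_by("socket:pe=2,oversubscribe"))
print(parse_map_by("ppr:2:socket:pe=4"))
```

The first call describes mapping by socket with 2 processing elements per process and oversubscription allowed; the second describes two processes per socket, each bound to 4 processing elements.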
Pavel Shamis
3a683419c5 Fixing broken dependency between ML/BCOLS
This is a hot-fix patch for the issue reported by Ralph.
In the future we plan to restructure the ml data structure layout.

Tested by Nathan.

cmr=v1.7.5:ticket=trac:4158

This commit was SVN r30619.

The following Trac tickets were found above:
  Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
2014-02-07 19:15:45 +00:00
Jeff Squyres
6f8e76df7e Revert r30539 and r30540; using the sqrt() to limit the computation is
just plain wrong (i.e., it gives wrong answers).  

When time permits, perhaps we can put in a better algorithm for
MPI_DIMS_CREATE (Andreas Schäfer mentioned that nnodes can now be on
the order of millions, and the current algorithm is... inefficient, at
best).

This commit was SVN r30606.

The following SVN revision numbers were found above:
  r30539 --> open-mpi/ompi@fb67d98867
  r30540 --> open-mpi/ompi@4417ed2133
2014-02-07 13:46:48 +00:00
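The revert above does not say exactly how the sqrt() bound produced wrong answers. One common pitfall with such a bound, shown here as a sketch and not as a reconstruction of r30539, is that a number can have one prime factor larger than its square root, so a prime list capped at sqrt(n) misses it unless the leftover cofactor is kept. The function names are hypothetical.

```python
import math


def primes_up_to(limit):
    """Return all primes <= limit by trial division (fine for small limits)."""
    return [
        p for p in range(2, limit + 1)
        if all(p % d for d in range(2, math.isqrt(p) + 1))
    ]


def factor_with_sqrt_bound(n):
    """Divide out primes <= sqrt(n); return (factors found, leftover cofactor)."""
    factors = []
    for p in primes_up_to(math.isqrt(n)):
        while n % p == 0:
            factors.append(p)
            n //= p
    return factors, n


def factor_complete(n):
    """Same bound, but keep the leftover: it is either 1 or a single prime."""
    factors, leftover = factor_with_sqrt_bound(n)
    if leftover > 1:
        factors.append(leftover)
    return factors


# 14 = 2 * 7, but sqrt(14) < 4, so the capped prime list never contains 7.
print(factor_with_sqrt_bound(14))
print(factor_complete(14))
```

After all primes up to sqrt(n) are divided out, the leftover cannot be a product of two larger primes (their product would exceed n), which is why appending it is sufficient.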
Ralph Castain
74d3393a4f Revert r30600, r30602-30604 as the first one broke the tarball and the others couldn't fix it
This commit was SVN r30605.

The following SVN revision numbers were found above:
  r30600 --> open-mpi/ompi@7d2c4cb468
  r30602 --> open-mpi/ompi@9e751a0302
  r30604 --> open-mpi/ompi@3012c280cf

Revision number ranges (suitable for "git log"):
  r30602-30604 --> open-mpi/ompi@9e751a03^..3012c280
2014-02-07 04:38:06 +00:00
Ralph Castain
3012c280cf I surrender - this code is just too interbred with other components for me to clean up, so turn it off for now
This commit was SVN r30604.
2014-02-07 04:16:21 +00:00
Ralph Castain
3954311bac We have rules about not cross-integrating components, even across frameworks - please follow them.
This commit was SVN r30603.
2014-02-07 03:46:45 +00:00
Ralph Castain
9e751a0302 You absolutely, positively *cannot* include a header file from a component in the base functions!
This commit was SVN r30602.
2014-02-07 03:27:06 +00:00
Nathan Hjelm
a06e491c2c ob1: large buffered sends were broken by the ob1 optimizations. fix them
The problem was caused by the static request optimization. The buffered send case
is much like the isend case in that the request structure may be needed after
MPI_Bsend completes. Fix this case by calling isend and freeing the resulting
request.

cmr=v1.7.5:ticket=trac:4149

This commit was SVN r30601.

The following Trac tickets were found above:
  Ticket 4149 --> https://svn.open-mpi.org/trac/ompi/ticket/4149
2014-02-07 00:12:36 +00:00
Jeff Squyres
7d2c4cb468 There's a few ml-related bugs outstanding, and Nathan is looking into
them, but it's going to take a little time (at least one day).  So
Nathan says it's ok to .ompi_ignore coll ml until he's able to fix it.

This commit was SVN r30600.
2014-02-06 23:51:03 +00:00
George Bosilca
32a494e73b When CXX support is disabled don't check if coverage is supported. The
problem is that ompi_cxx_vendor is only defined when MPI CXX support
is enabled.

This commit was SVN r30599.
2014-02-06 21:40:26 +00:00
Nathan Hjelm
3902cf66f1 ob1: OBJ_CONSTRUCT the convertor in the send_inline optimization.
This change does not appear to increase the small message latency of ping-pong
benchmarks and fixes an issue found by our ibm datatype tests.

Fixes trac:4232

cmr=v1.7.5:ticket=trac:4149

This commit was SVN r30598.

The following Trac tickets were found above:
  Ticket 4149 --> https://svn.open-mpi.org/trac/ompi/ticket/4149
  Ticket 4232 --> https://svn.open-mpi.org/trac/ompi/ticket/4232
2014-02-06 21:27:42 +00:00
Nathan Hjelm
a41cb1f086 Remove duplicate definition of xpmem_apid_t
cmr=v1.7.5:ticket=trac:4216

This commit was SVN r30589.

The following Trac tickets were found above:
  Ticket 4216 --> https://svn.open-mpi.org/trac/ompi/ticket/4216
2014-02-06 20:38:20 +00:00
Jeff Squyres
12a4d1a27f Minor update to r30430: put the variables at the top of the function
instead of making an inner block.

Refs trac:4185

This commit was SVN r30588.

The following SVN revision numbers were found above:
  r30430 --> open-mpi/ompi@ea3cb1e110

The following Trac tickets were found above:
  Ticket 4185 --> https://svn.open-mpi.org/trac/ompi/ticket/4185
2014-02-06 18:37:19 +00:00
Jeff Squyres
fad3cbf639 Revert r30571.
This commit was SVN r30587.

The following SVN revision numbers were found above:
  r30571 --> open-mpi/ompi@081b679881
2014-02-06 18:35:30 +00:00
Jeff Squyres
ef4e65bd2c Very small configure.ac reorganization:
* Move pid_t size check up to be with the other size checks
 * Move the MPI profiling setup to be below the Java setup

This commit was SVN r30574.
2014-02-06 11:42:34 +00:00
Mike Dubman
28949efcaf OSHMEM: MXM2: a2a perf improvement on large scale
Allow only a limited number of connections to have 'puts'
that do not require remote completion ack. That will
greatly improve performance of shmem_fence()/shmem_quiet()
and shmem_barrier() when there are many active connections.

fixed by Alex, reviewed by Miked

Refs trac:3763

This commit was SVN r30573.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2014-02-06 08:42:45 +00:00
Mike Dubman
27a763c86c OSHMEM: finalization fixes
Fixes mxm endpoint destruction and a hang during
SHMEM finalization.

Add a barrier between spml del procs and finalization.
Not having it caused hangs because ikrit spml cannot properly
disconnect if its peer has already finalized.

Refs trac:3763

This commit was SVN r30572.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2014-02-06 08:40:43 +00:00
Mike Dubman
081b679881 OMPI: add call to del_procs
fixed by AlexM, reviewed by miked
cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r30571.
2014-02-06 08:38:32 +00:00
Ralph Castain
c617d66d98 Paul Hargrove has pointed out that some big SMP systems (e.g., from SGI) configure Torque differently - instead of listing each node name once/slot in the nodefile, they list the node only once and set an envar to indicate the number of procs/node being allocated. Add an MCA param users can set to indicate we are in such an environment, and then use the envar to set the slots. Error out if the mode flag is given, but (a) we don't find the PBS_PPN envar, or (b) we find a node actually listed more than once in the PBS_Nodefile.
cmr=v1.7.5:reviewer=jsquyres:subject=Support SMP mode in Torque

This commit was SVN r30568.
2014-02-05 15:51:17 +00:00
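The slot-counting rule described in the Torque commit above can be sketched as follows. This is a simplified model, not the ORTE code: `slots_from_nodefile` and its `smp_mode` flag are hypothetical names (the commit does not name the MCA param), while PBS_PPN and the nodefile layout come from the commit message.

```python
def slots_from_nodefile(node_lines, environ, smp_mode):
    """Return {node: slots} from nodefile lines (hypothetical helper).

    Default mode: a node is listed once per slot, so count occurrences.
    SMP mode: each node is listed once and PBS_PPN gives the slots per node.
    """
    nodes = [line.strip() for line in node_lines if line.strip()]

    if not smp_mode:
        counts = {}
        for node in nodes:
            counts[node] = counts.get(node, 0) + 1
        return counts

    # Error case (a): the mode flag is set but PBS_PPN is missing.
    if "PBS_PPN" not in environ:
        raise RuntimeError("SMP mode requested but PBS_PPN is not set")
    # Error case (b): a node appears more than once in the nodefile.
    if len(set(nodes)) != len(nodes):
        raise RuntimeError("SMP mode requested but a node is listed more than once")

    ppn = int(environ["PBS_PPN"])
    return {node: ppn for node in nodes}


print(slots_from_nodefile(["n1", "n1", "n2"], {}, smp_mode=False))
print(slots_from_nodefile(["n1", "n2"], {"PBS_PPN": "16"}, smp_mode=True))
```

In a real run the lines would come from the file Torque provides and `environ` would be `os.environ`; both are passed in here so the logic can be exercised without a scheduler.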
Ralph Castain
78e1846b4b Add further clarification regarding new "test" APIs
This commit was SVN r30567.
2014-02-05 15:48:31 +00:00
George Bosilca
6ee06b7fda No exit down into a BTL.
This commit was SVN r30566.
2014-02-05 15:04:01 +00:00
Ralph Castain
1326ed704f Per the RFC discussed here:
http://www.open-mpi.org/community/lists/devel/2014/01/13789.php

add support for async modex when requested.

cmr=v1.7.5:reviewer=jsquyres:subject=Add async modex support

This commit was SVN r30565.
2014-02-05 14:39:27 +00:00
Jeff Squyres
b7d10b3499 Remove some unused AC_DEFINE's, and also remove a redundant AC_DEFINE
This commit was SVN r30564.
2014-02-05 12:24:02 +00:00
Jeff Squyres
45b938848e Sync with 1.7 NEWS bullets
This commit was SVN r30562.
2014-02-04 22:33:13 +00:00
Nathan Hjelm
4248cb1d9c Fix typo and portability issue in r30555
cmr=v1.7.4:ticket=trac:4223

This commit was SVN r30559.

The following SVN revision numbers were found above:
  r30555 --> open-mpi/ompi@5c35b5ba19

The following Trac tickets were found above:
  Ticket 4223 --> https://svn.open-mpi.org/trac/ompi/ticket/4223
2014-02-04 20:15:32 +00:00
Joshua Ladd
1dbd8688db This fixes a long-standing bug in the OpenIB BTL's MCA param initialization. Only caught if BTL_OPENIB_FAILOVER_ENABLED. Thanks to Jeff for spotting. This should be added to:
cmr=v1.7.4:reviewer=jsquyres
cmr=v1.6.6

This commit was SVN r30558.
2014-02-04 20:01:39 +00:00
Jeff Squyres
9ba6c6fe41 Add missing header file
This commit was SVN r30556.
2014-02-04 19:50:02 +00:00
Nathan Hjelm
5c35b5ba19 Fix wrapper ldflags.
cmr=v1.7.4:reviewer=jsquyres

This commit was SVN r30555.
2014-02-04 19:44:08 +00:00
Ralph Castain
230336b6a8 Upgrade the security framework to avoid multiple hits against the global security server. Add support for future case where mpirun assigns a global security credential for a given run, though we need to work out how to handle connect-accept from other mpirun's in that case. Remove a bunch of duplicate code in the OOB by consolidating the connection handshake code.
Refs trac:4221

This commit was SVN r30554.

The following Trac tickets were found above:
  Ticket 4221 --> https://svn.open-mpi.org/trac/ompi/ticket/4221
2014-02-04 14:47:04 +00:00
Mike Dubman
3d8c06d1b4 fix min.supported version for mxm check
reviewed by Alex
cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r30553.
2014-02-04 14:45:47 +00:00
Adrian Reber
fde1040d2f Use unique collective ids for the checkpoint/restart code
This commit was SVN r30552.
2014-02-04 14:03:05 +00:00
Ralph Castain
5980b7e042 Add a security framework for authenticating connections - we will add LDAP, Kerberos, and Keystone support in the next month. For now, just put a placeholder "basic" module that does the minimum.
Wire the security check into ORTE's OOB handshake, and add a "version" check to ensure that both ends are from the same ORTE version. If not, report the mismatch and refuse the connection

Fixes trac:4171

cmr=v1.7.5:reviewer=jsquyres:subject=Add a security framework for authenticating connections

This commit was SVN r30551.

The following Trac tickets were found above:
  Ticket 4171 --> https://svn.open-mpi.org/trac/ompi/ticket/4171
2014-02-04 01:38:45 +00:00
Ralph Castain
e43589ed84 Fix warning - thanks to Paul Hargrove for reporting it
cmr=v1.7.4:reviewer=ompi-gk1.7

This commit was SVN r30548.
2014-02-03 23:51:45 +00:00
Jeff Squyres
d9786c42f7 Addendum to r30531:
* Fix some comments
 * Fix some spacing in the non-verbose "make" output
 * Make javadoc non-verbose output like other non-verbose output
 * Remove the use of JAVA_CLASS_FILES; it wasn't correct anyway (it
   both derived names from JAVA_SRC_FILES ''and'' used mpi/*.class, so
   many files were listed twice)
 * Move the generation of javadoc files to "make" time (vs. "make
   install" time) by putting the "doc" subdirectory in BUILT_SOURCES
 * Make doc dependent upon mpi/MPI.class, not mpi.jar -- we only need
   the classes to exist, not the final jarfile.
 * Make jdoc-install dependent upon a real build artifact (the doc
   dir), not an artificial name that will never exist (jdoc)
 * Separate the removal of the doc (and mpi) subdirectories during
   "make clean" off into the clean-local target, because CLEANFILES
   can really only have ''files'' added to it.

These changes also fix parallel builds.

cmr=v1.7.5:ticket=trac:4214

This commit was SVN r30547.

The following SVN revision numbers were found above:
  r30531 --> open-mpi/ompi@6ca8e68e4b

The following Trac tickets were found above:
  Ticket 4214 --> https://svn.open-mpi.org/trac/ompi/ticket/4214
2014-02-03 22:32:45 +00:00
Ralph Castain
9514858067 As Rolf pointed out, this patch wasn't needed on the trunk - just the 1.7 branch. Sigh
This commit was SVN r30544.
2014-02-03 21:40:56 +00:00
Ralph Castain
4d533c81fb Minor cleanup required when configuring with an external libevent. Thanks to Orion Poplawski for the patch!
cmr=v1.7.4:reviewer=ompi-gk1.7

This commit was SVN r30543.
2014-02-03 21:03:05 +00:00
Ralph Castain
fab35dbffa Silence some Solaris warnings reported by Paul Hargrove
cmr=v1.7.4:reviewer=jsquyres:subject=Silence some Solaris warnings

This commit was SVN r30542.
2014-02-03 19:46:08 +00:00
Jeff Squyres
fa02bba7c5 Remove a bunch of extra whitespace.
Thanks to Andreas Schäfer for the original patch.

This commit was SVN r30541.
2014-02-03 19:30:43 +00:00
Jeff Squyres
4417ed2133 Gah; I missed the #include in r30539.
cmr=v1.7.5:ticket=trac:4217

This commit was SVN r30540.

The following SVN revision numbers were found above:
  r30539 --> open-mpi/ompi@fb67d98867

The following Trac tickets were found above:
  Ticket 4217 --> https://svn.open-mpi.org/trac/ompi/ticket/4217
2014-02-03 19:28:07 +00:00
Jeff Squyres
fb67d98867 Suggestion from Andreas Schäfer: we really only need sqrt(freeprocs)
primes.  This considerably reduces the computational load when
freeprocs is large.

cmr=v1.7.5:reviewer=hjelmn:subject=MPI_Dims_create optimization

This commit was SVN r30539.
2014-02-03 19:21:04 +00:00
Nathan Hjelm
12f0bf9488 basesmuma: missed a couple of MB references
cmr=v1.7.5:ticket=trac:4158

This commit was SVN r30538.

The following Trac tickets were found above:
  Ticket 4158 --> https://svn.open-mpi.org/trac/ompi/ticket/4158
2014-02-03 18:19:53 +00:00