Commit graph

19906 commits

Author SHA1 Message Date
Nathan Hjelm
20af8339e6 osc/base: add support for datatypes that are a contiguous combination
of the primitive datatype

In this case we cannot use the convertor to run the accumulate operation,
since the datatype is more or less a primitive type.

cmr=v1.8:ticket=trac:4449

This commit was SVN r31222.

The following Trac tickets were found above:
  Ticket 4449 --> https://svn.open-mpi.org/trac/ompi/ticket/4449
2014-03-25 21:00:26 +00:00
Nathan Hjelm
d681eb4655 osc/rdma: fix warnings introduced by r31204
cmr=v1.8:ticket=trac:4449

This commit was SVN r31221.

The following SVN revision numbers were found above:
  r31204 --> open-mpi/ompi@949abe45cd

The following Trac tickets were found above:
  Ticket 4449 --> https://svn.open-mpi.org/trac/ompi/ticket/4449
2014-03-25 21:00:19 +00:00
Jeff Squyres
f0712076a0 Update the nightly script to start building v1.8 tarballs
This commit was SVN r31208.
2014-03-25 16:19:15 +00:00
Nathan Hjelm
949abe45cd osc: fix datatype related issues in the one-sided code
This commit fixes three issues:

 - osc/rdma: The target side of an accumulate was using the target datatype
   in the receive to the packed buffer. This was conflicting with the way
   the reduction is done into the target buffer. Changed the receive to use
   the primitive datatype.

 - osc/base: The copy table was completely wrong. Fixed the table to match
   the underlying datatypes (which are opal not ompi datatypes).

 - osc/base: There is a problem using the optimized description. Fall back
   on using the non-optimized description until we can understand what is
   going wrong.

cmr=v1.8:reviewer=jsquyres

This commit was SVN r31204.
2014-03-25 15:28:48 +00:00
Nathan Hjelm
bc55276844 osc/rdma: fix bug in the active message code that could cause erroneous
results

The code to handle completion messages did not correctly increment the
number of expected messages. This could cause wait to return before all
incoming messages are complete.

I also added a check to ensure that start returns an error if we are in
a passive access epoch.

cmr=v1.8:reviewer=jsquyres

This commit was SVN r31203.
2014-03-25 15:28:36 +00:00
Mike Dubman
be3fc7bf20 OSHMEM: better error messages when failing
Provide users with the correct failure reason.

fixes trac:4433

This commit was SVN r31202.

The following Trac tickets were found above:
  Ticket 4433 --> https://svn.open-mpi.org/trac/ompi/ticket/4433
2014-03-25 15:27:13 +00:00
Jeff Squyres
8c2b9658ce Commit upstream ROMIO fix: dbad7873926a75adbff0fd0140ae321412f70d66
ROMIO code assumes all processes will use the same ROMIO driver.  We
were not reaching the "find a common file system" logic when NFS was
enabled: everyone stat-ed the file system without errors, but some
processes found a different file system (e.g., if some processes are
writing to NFS and others to UFS).

See discussion beginning here:
http://lists.mpich.org/pipermail/discuss/2014-March/002403.html

Tested-by: Jeff Squyres <jsquyres@cisco.com>

Submitted by Rob Lathan, reviewed by Jeff Squyres

cmr=v1.8:reviewer=ompi-rm1.8

This commit was SVN r31201.
2014-03-25 14:50:07 +00:00
Alina Sklarevich
947233f539 common/verbs: added a call to ompi_ibv_free_device_list.
The ompi_common_verbs_find_ports function had a call to
ompi_ibv_get_device_list, but not to ompi_ibv_free_device_list.

fixed by Alina, reviewed by Vasily/Mike.
cmr=v1.8:reviewer=ompi-rm1.8 

This commit was SVN r31200.
2014-03-25 14:41:09 +00:00
Mike Dubman
b8dddabcfb Add config section for upcoming ConnectX-4 card
cmr=v1.8:reviewer=ompi-rm1.8

This commit was SVN r31199.
2014-03-25 14:27:09 +00:00
Oscar Vega-Gisbert
cc511d0efc Avoid using the Status member in Comm, Message and File.
This commit was SVN r31198.
2014-03-24 22:28:30 +00:00
Nathan Hjelm
0ed44f2fdb osc/rdma: add support for datatypes with large descriptions
This commit adds large datatype description support to the osc/rdma
component. Support is provided by an additional send/recv of the datatype
description if the description does not fit in an eager buffer. The
code is designed to require minimal new code and not for speed. We
consider this code path to be a slow path.

Refs trac:1905

cmr=v1.8:reviewer=jsquyres

This commit was SVN r31197.

The following Trac tickets were found above:
  Ticket 1905 --> https://svn.open-mpi.org/trac/ompi/ticket/1905
2014-03-24 18:57:29 +00:00
Ralph Castain
390645ac2a Per patch from Tetsuya Mishima, do a nicer job of warning the user that we need to map to a higher level to get the number of requested cpus/rank. Also, change the mapping policy to "byslot" when falling back to that option.
cmr=v1.8:reviewer=rhc

This commit was SVN r31196.
2014-03-24 15:47:29 +00:00
Ralph Castain
bd9bd2ff16 Be consistent in our handling of the "only HNP in allocation" case when setting up the VM. Thanks to Tetsuya Mishima for the suggestion.
cmr=v1.8:reviewer=rhc

This commit was SVN r31195.
2014-03-24 15:28:09 +00:00
Vasily Filipov
c424ad94f3 BTL/OPENIB: remove AC_RUN_IFELSE from configure and check AF_IB support by lib rdmacm during component_init.
This commit was SVN r31194.
2014-03-24 13:36:04 +00:00
Nathan Hjelm
15a8c9d7b8 coll/ml: addendum to r31189. increment the bcol_index
cmr=v1.8:ticket=trac:4436

This commit was SVN r31193.

The following SVN revision numbers were found above:
  r31189 --> open-mpi/ompi@c7d830f4b9

The following Trac tickets were found above:
  Ticket 4436 --> https://svn.open-mpi.org/trac/ompi/ticket/4436
2014-03-21 22:03:56 +00:00
Nathan Hjelm
128cfe0a39 coll/ml: cleanup tabs, indentation, and trailing whitespace in
bcol_basesmuma_bcast.c

This commit was SVN r31192.
2014-03-21 21:54:48 +00:00
Nathan Hjelm
d241f95af1 squash into previous. fix coll ml bcast
This commit was SVN r31191.
2014-03-21 21:54:41 +00:00
Nathan Hjelm
6740813c27 bcol/basesmuma: fix selection of coll/ml when only using local procs
When we are only using local ranks, basesmuma needs to provide an allreduce
function for both large and small messages, or else the coll/ml selection
logic will fail. In the future this logic should probably be updated to
just disable allreduce in coll/ml instead of disabling coll/ml. For now
it should be correct to say the basesmuma allgather works for larger
messages.

cmr=v1.8:reviewer=manjugv

This commit was SVN r31190.
2014-03-21 21:54:35 +00:00
Nathan Hjelm
c7d830f4b9 coll/ml: improve the buffer size calculation and ensure the bcol_index in
a hierarchy actually matches a bcol that is in use.

There was a bug in one of the paths to calculate the ml buffer size. I fixed
the bug and squashed all the paths together to avoid further issues (the
result was correct in another path that calculated the same value).

Additionally, the i_hier was being used as the bcol_index. This is not
correct in a couple of cases so I added a variable to keep track of the
real bcol_index.

cmr=v1.8:reviewer=pasha

This commit was SVN r31189.
2014-03-21 21:54:28 +00:00
Nathan Hjelm
f1dd589092 coll/ml: there is no reason not to enable coll/ml when a process is not
bound.

This case is correctly handled by coll/ml, so remove the check that disables
coll/ml in the not bound case.

cmr=v1.8:reviewer=manjugv

This commit was SVN r31188.
2014-03-21 21:54:21 +00:00
Nathan Hjelm
08bbdcbf61 coll/ml: fix leaks in coll/ml resources
This patch fixes two leaks:

 - Fix typo in fallback collective code that caused coll/ml to retain
   the ibcast module twice but only release it once. One of those ibcast
   saves was supposed to be bcast.

 - Do not check for module initialization in the module destructor. It
   is possible to destruct a module that is only partially set up.

cmr=v1.8:reviewer=manjugv

This commit was SVN r31187.
2014-03-21 21:54:14 +00:00
Matthias Jurenz
c49a5d1e12 Changes to VT:
Disabled support for CUPTI API version > 4 (CUDA 6) due to API mismatch

This commit was SVN r31186.
2014-03-21 09:16:46 +00:00
Ralph Castain
2df29fd0f0 Add .mailmap to hgignore
This commit was SVN r31183.
2014-03-21 03:34:25 +00:00
Joshua Ladd
b3c6b0629c Clean up sshmem/mmap error messages. This should be added to
cmr=v1.7.5:ticket=4432

This commit was SVN r31180.

The following Trac tickets were found above:
  Ticket 4432 --> https://svn.open-mpi.org/trac/ompi/ticket/4432
2014-03-20 18:22:12 +00:00
Mike Dubman
aa4775b021 OSHMEM: better error handling for sshmem/mmap
Refs trac:4399

fixed by Roman, reviewed by Miked

cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r31179.

The following Trac tickets were found above:
  Ticket 4399 --> https://svn.open-mpi.org/trac/ompi/ticket/4399
2014-03-20 16:47:42 +00:00
Ralph Castain
66260615aa reverse sync the NEWS with the 1.7.5 official release
This commit was SVN r31177.
2014-03-20 16:32:33 +00:00
Jeff Squyres
e5504859d2 Follow on to r31172: improve the help message
* Show the help message for all errors, not just EINVAL
* Put the help message in the correct helpfile
* Fix grammar and spelling, and expand the help message

cmr=v1.7.5:ticket=trac:4431

This commit was SVN r31173.

The following SVN revision numbers were found above:
  r31172 --> open-mpi/ompi@3e51d28b97

The following Trac tickets were found above:
  Ticket 4431 --> https://svn.open-mpi.org/trac/ompi/ticket/4431
2014-03-20 14:51:56 +00:00
Joshua Ladd
3e51d28b97 This commit adds a help message when system limitations prevent setting up OSHMEM's symmetric heap. This fixes trac:4399 and should be added to
cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r31172.

The following Trac tickets were found above:
  Ticket 4399 --> https://svn.open-mpi.org/trac/ompi/ticket/4399
2014-03-20 14:42:25 +00:00
Mike Dubman
d8288fa39d OSHMEM: Fix call prepare_src with a NULL endpoint
see issue: https://svn.open-mpi.org/trac/ompi/ticket/4399

Refs trac:4399

fixed by Igor, reviewed by Alex
cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r31168.

The following Trac tickets were found above:
  Ticket 4399 --> https://svn.open-mpi.org/trac/ompi/ticket/4399
2014-03-20 13:11:25 +00:00
Ralph Castain
08fd24f452 Ensure we properly check for sockaddr_in before checking for tcp
This commit was SVN r31166.
2014-03-20 00:17:29 +00:00
Dave Goodell
5f3b81e291 oob: delete events when destroying a peer
Without this patch running ring_c with the usnic BTL under valgrind will
cause the orteds to segfault.

Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
Reviewed-by: Ralph Castain <rhc@open-mpi.org>

cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r31161.
2014-03-19 22:15:49 +00:00
Jeff Squyres
412364d66e Change "test ==" to "test =" to be more portable.
This commit was SVN r31159.
2014-03-19 21:38:01 +00:00
Jeff Squyres
7710eb0423 Further README tweaks about OSHMEM.
cmr=v1.7.5:ticket=trac:4427

This commit was SVN r31157.

The following Trac tickets were found above:
  Ticket 4427 --> https://svn.open-mpi.org/trac/ompi/ticket/4427
2014-03-19 20:57:37 +00:00
Jeff Squyres
36dcddcad8 Update README regarding OSHMEM configure options and default behavior.
cmr=v1.7.5:ticket=trac:4427

This commit was SVN r31156.

The following Trac tickets were found above:
  Ticket 4427 --> https://svn.open-mpi.org/trac/ompi/ticket/4427
2014-03-19 20:50:28 +00:00
Ralph Castain
17d618abd2 As per the thread on ticket #4399, OSHMEM does not support non-Linux platforms. So provide a check for Linux and error out if --enable-oshmem is given on a non-supported platform. If no OSHMEM option is given (enable or disable), then don't attempt to build OSHMEM unless we are on a Linux platform. Default to building if we are on Linux for now, pending the outcome of the Debian situation.
cmr=v1.7.5:reviewer=jsquyres:subject=disable OSHMEM on non-Linux platforms

This commit was SVN r31155.
2014-03-19 20:32:15 +00:00
Ralph Castain
d17f811ff5 Surrender to the tyranny of C++ and give up on enum for node states, as nice as that would be, in favor of retaining memory footprint constraints.
This commit was SVN r31149.
2014-03-19 16:15:24 +00:00
Nathan Hjelm
20fe3804b0 Fix comment in r31146
cmr=v1.7.5:ticket=trac:4425

This commit was SVN r31148.

The following SVN revision numbers were found above:
  r31146 --> open-mpi/ompi@dca2f0027e

The following Trac tickets were found above:
  Ticket 4425 --> https://svn.open-mpi.org/trac/ompi/ticket/4425
2014-03-19 16:09:20 +00:00
Jeff Squyres
22e6417d9e Return non-SUCCESS error codes from attribute copy functions.
Without this, an attribute copy function could return non-success, but
it would not be propagated upwards.  This caused the intel
MPI_Keyval3_* tests to fail.

cmr=v1.8:reviewer=hjelmn

This commit was SVN r31147.
2014-03-19 15:45:38 +00:00
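The propagation fix described in commit 22e6417d9e can be illustrated with a minimal sketch. It is not the Open MPI code: `copy_fn_t`, `copy_all_attributes`, `SKETCH_SUCCESS` and the two sample callbacks are hypothetical names introduced here. The point is only that the caller returns the first non-success code from a copy callback instead of discarding it.

```c
#include <stddef.h>

/* Hypothetical success code for this sketch (not an Open MPI constant). */
#define SKETCH_SUCCESS 0

/* Hypothetical copy callback: copies one attribute value, returns a status. */
typedef int (*copy_fn_t)(int in, int *out);

/* Run each copy callback in turn. The first non-success return code is
 * handed back to the caller instead of being silently dropped. */
int copy_all_attributes(const copy_fn_t *fns, const int *in, int *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        int rc = fns[i](in[i], &out[i]);
        if (rc != SKETCH_SUCCESS) {
            return rc; /* propagate the callback's error upwards */
        }
    }
    return SKETCH_SUCCESS;
}

/* Sample callbacks: one that succeeds, one that always fails with code 17. */
int copy_ok(int in, int *out) { *out = in; return SKETCH_SUCCESS; }
int copy_fail(int in, int *out) { (void)in; (void)out; return 17; }
```

With the callbacks `{ copy_ok, copy_fail }`, the helper copies the first attribute and then returns 17, so a test that expects the failure to surface can observe it.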
Nathan Hjelm
dca2f0027e Protect against 0-byte allocations in cart_create and cart_sub.
cmr=v1.7.5:reviewer=jsquyres

This commit was SVN r31146.
2014-03-19 15:38:12 +00:00
Jeff Squyres
7adb137409 Fix segv in MPI_Graph_create_undef_c Intel test.
When you call MPI_Graph_create with an old_comm of size N, and pass
nnodes=(N-1), then the Nth proc is supposed to get MPI_COMM_NULL out.
The code in this base function didn't properly handle the proc(s) that
are supposed to get MPI_COMM_NULL out.

cmr=v1.7.5:reviewer=hjelmn

This commit was SVN r31145.
2014-03-19 15:16:28 +00:00
Jeff Squyres
c6994adf66 Add missing show_help message.
Found via Cisco MTT (i.e., it complained of not being able to find
this show_help message).

cmr=v1.8:reviewer=dgoodell

This commit was SVN r31144.
2014-03-19 14:09:19 +00:00
Matthias Jurenz
cc3dd86121 Changes to VT:
Fixed compiler warning with the Clang compiler (no previous prototype for function '__fprintf_chk')

This commit was SVN r31143.
2014-03-19 13:39:26 +00:00
Mike Dubman
9314e9f0a3 OSHMEM: Fix mmap compatibility issue
MAP_ANONYMOUS changes how the fd and offset arguments of mmap() are interpreted, and the interpretation differs by platform.
Linux:
The mapping is not backed by any file; the fd and offset arguments are ignored.
Mac:
Map anonymous memory not associated with any specific file.  The offset argument is
ignored.  Mac OS X specific: the file descriptor used for creating MAP_ANON regions
can be used to pass some Mach VM flags, and can be specified as -1 if no such flags
are associated with the region.
FreeBSD:
Map anonymous memory not associated with any specific file.  The file
descriptor used for creating MAP_ANON must be -1.  The offset argument must be 0.

fixed by Igor, reviewed by Mike

Refs trac:4399

cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r31140.

The following Trac tickets were found above:
  Ticket 4399 --> https://svn.open-mpi.org/trac/ompi/ticket/4399
2014-03-19 07:05:40 +00:00
Ralph Castain
f7df960198 Silence warning
This commit was SVN r31139.
2014-03-18 23:15:29 +00:00
Nathan Hjelm
e764d3bebc coll/ml: really remove the asserts in the barrier setup
cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r31136.
2014-03-18 22:04:50 +00:00
Jeff Squyres
3da579139b More corrections w.r.t. process groups
To accompany r31092 and r310924, also make sure to create a new process
group in the child right after the orted forks.  Add trivial configury
to ensure that we have setpgid, and only do the setpgid/getpgid if we
have setpgid.

Without this commit, killing the entire process group can do
unexpected things (e.g., kill the orted, mpirun, and even mpirun's
parent!).

cmr=v1.7.5:reviewer=rhc

This commit was SVN r31132.

The following SVN revision numbers were found above:
  r31092 --> open-mpi/ompi@99c9ecaed0

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r310924
2014-03-18 21:31:01 +00:00
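The process-group step described in commit 3da579139b can be sketched with plain POSIX calls. This is an illustration under stated assumptions, not the orted code: `child_gets_own_group` is a hypothetical helper, and the configure-time guard for setpgid that the commit mentions is omitted. The child calls `setpgid(0, 0)` right after the fork, which makes it the leader of a new group, so a later signal sent to that group does not reach the parent.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that moves itself into a new process
 * group, then report what the child observed.
 * Returns 1 if the child became its own group leader, 0 if it did not,
 * and -1 if fork() or waitpid() failed. */
int child_gets_own_group(void)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Child: setpgid(0, 0) creates a new process group whose ID
         * equals the child's PID. */
        if (setpgid(0, 0) != 0) {
            _exit(2);
        }
        _exit(getpgrp() == getpid() ? 0 : 1);
    }

    /* Parent: collect the child's exit status. */
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

Once the child has its own group, sending a signal to that group (for example with `kill(-pid, SIGTERM)`) affects the child and its descendants only.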
Nathan Hjelm
e030443d45 coll/ml: further improve the hierarchy discovery to handle the case where a
sbgp module fails to group any processes on any nodes.

cmr=v1.7.5:reviewer=manjugv

This commit was SVN r31131.
2014-03-18 21:26:24 +00:00
Nathan Hjelm
8b2d723fd4 coll/ml: fix valgrind warning about reading uninitialized value
This isn't causing any errors that I know about but it does fix an
annoying valgrind warning. Simple fix, no review required.

cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r31130.
2014-03-18 21:26:17 +00:00
Nathan Hjelm
d9c8bf3785 coll/ml: move error messages to verbose output
There are situations where coll/ml does not initialize properly. These will
eventually need to be fixed but in the meantime it is better to not always
print an error message because the collective framework can still fall back
on another collective module. This commit reduces the verbose output.

cmr=v1.7.5:reviewer=manjugv

This commit was SVN r31129.
2014-03-18 21:26:10 +00:00
Nathan Hjelm
97d7315dd2 coll/ml: do not assert if a barrier algorithm is not available
It is usually not a good idea to assert when something is not implemented
or something goes wrong. Replace asserts with debug output and return.

cmr=v1.7.5:reviewer=manjugv

This commit was SVN r31128.
2014-03-18 21:26:04 +00:00
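The pattern named in commit 97d7315dd2, replacing an assert with debug output and an error return, can be sketched as follows. All names here (`setup_barrier`, `barrier_fn_t`, `trivial_barrier`, the `SKETCH_*` codes) are hypothetical and do not come from coll/ml. The sketch only shows a caller receiving an error code that it can act on, for example by falling back to another collective module, instead of the process aborting.

```c
#include <stdio.h>

/* Hypothetical status codes for this sketch (not Open MPI constants). */
#define SKETCH_SUCCESS 0
#define SKETCH_ERR_NOT_AVAILABLE (-1)

/* Hypothetical barrier algorithm entry point. */
typedef int (*barrier_fn_t)(void);

/* Instead of assert(NULL != algorithm), emit optional debug output and
 * return an error so the caller can choose a fallback. */
int setup_barrier(barrier_fn_t algorithm, int verbose)
{
    if (NULL == algorithm) {
        if (verbose) {
            fprintf(stderr, "barrier algorithm not available\n");
        }
        return SKETCH_ERR_NOT_AVAILABLE;
    }
    return algorithm();
}

/* Sample algorithm that always succeeds. */
int trivial_barrier(void) { return SKETCH_SUCCESS; }
```

Calling `setup_barrier(NULL, 1)` prints the debug line and returns `SKETCH_ERR_NOT_AVAILABLE`, while `setup_barrier(trivial_barrier, 0)` returns `SKETCH_SUCCESS`.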