1
1
Граф коммитов

6407 Коммитов

Автор SHA1 Сообщение Дата
Jeff Squyres
e3d0782788 Move the assignment after the bozo check.
This commit was SVN r28669.
2013-06-22 12:38:32 +00:00
Rolf vandeVaart
5ebb74bee3 Fix case where amount of data sent is less than expected. Otherwise, we will get hang when running the RGET protocol.
Reviewed by hjelm,bosilca.

This commit was SVN r28667.
2013-06-21 18:35:16 +00:00
Joshua Ladd
0b5c1f2ea8 Add 'generic' support for PMI2 (previously, we checked for PMI2 only on Cray systems.) If your resource manager (e.g. SLURM) has support for PMI2, then the --with-pmi configure flag will enable its usage. If you don't have PMI2, then you will fallback to regular old PMI1. This patch was submitted by Ralph Castain and reviewed and pushed by Josh Ladd. This should be added to cmr:v1.7:reviewer=jladd
This commit was SVN r28666.
2013-06-21 15:28:14 +00:00
Jeff Squyres
2e5c18195b We want to ignore this MPI extension in the general case -- it's just
an example (and outputs stuff to stdout!).

This commit was SVN r28654.
2013-06-19 16:01:45 +00:00
Mike Dubman
d1c82994be fix: detect threading model to take appropriate flow in mxm
This commit was SVN r28648.
2013-06-16 08:40:06 +00:00
Jeff Squyres
a0b27f5b28 Better comment than what was submitted in r28614.
This commit was SVN r28631.

The following SVN revision numbers were found above:
  r28614 --> open-mpi/ompi@9556310bd0
2013-06-13 20:52:44 +00:00
Matthias Jurenz
ebf441ba4b Changes to VT: Fixed infinite recursion bug if the verbosity level (env. VT_VERBOSE) is higher or equal to 2
This commit was SVN r28624.
2013-06-13 07:33:22 +00:00
Jeff Squyres
34fb0712c4 Per https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/256, we need
to set *flag=1 when source == MPI_PROC_NULL.

cmr:v1.7.2:reviewer=dgoodell
cmr:v1.6.5:reviewer=dgoodell

This commit was SVN r28621.
2013-06-12 21:38:07 +00:00
Nathan Hjelm
db9bce0926 Add destructors for MPI_T error codes
This commit was SVN r28618.
2013-06-12 14:58:14 +00:00
Matthias Jurenz
ba9bc238ee attempt to fix #3627: Pass all configure options from the OMPI top-level configure to the OTF sub-configure
This commit was SVN r28616.
2013-06-12 13:39:21 +00:00
Mike Dubman
9556310bd0 cosmetic: add comment with rationale for malloc.h include
This commit was SVN r28614.
2013-06-12 05:58:32 +00:00
Nathan Hjelm
9b1f32bf12 BTL: add flags for signaled BTL operations
As per discussion in the June 2013 developer meeting these
flags will be used by the PML in the future to request
asynchronous progress on an operation. The naming was chosen
to reflect that a BTL supports this mode (MCA_BTL_FLAG_SIGNALED)
and that a descriptor should "signal" the remote side to wake
up and progress the message (MCA_BTL_DES_FLAG_SIGNAL).

Future commits will update OB1 to take advantage of this
feature when performing the RDMA get or RDMA rendezvous
protocols.

This commit was SVN r28612.
2013-06-11 21:52:20 +00:00
Jeff Squyres
bf7f9b1f41 Fix minor typo in man page.
This commit was SVN r28606.
2013-06-10 13:44:48 +00:00
Mike Dubman
d18b3ae1a7 fix malloc deprication error with gcc 4.6.3 on ubuntu/fedora
This commit was SVN r28605.
2013-06-09 18:13:16 +00:00
George Bosilca
d789423d34 Typo.
This commit was SVN r28603.
2013-06-08 10:44:02 +00:00
Vishwanath Venkatesan
0b727f84da Avoid malloc of zero bytes, add a check and avoid it.
This commit was SVN r28597.
2013-06-06 14:08:57 +00:00
Edgar Gabriel
2d4655a05a Logic has been revised compared to the previous implementation.
This commit was SVN r28594.
2013-06-05 23:47:42 +00:00
Edgar Gabriel
03c1db7a3a fix the calculation of the UNIFORM flag.
This commit was SVN r28593.
2013-06-05 23:18:50 +00:00
Vishwanath Venkatesan
7d6a05982a Removing the gather_array based on the flag UNIFORM FVIEW for read all operations (dynamic/static),
+ Disabling Timing data extraction by default in dynamic write all

This commit was SVN r28592.
2013-06-05 21:35:37 +00:00
Vishwanath Venkatesan
55878674d7 1. Removing the allgather_array based on the flag UNIFORM FVIEW. This is not really and optimization.
2. Fixing some of the debug printf's these are outdated.

This commit was SVN r28591.
2013-06-05 21:30:15 +00:00
Jeff Squyres
713e3aa3db Refs trac:3626: that ticket specifically refers to the v1.6 branch; this
commit is the trunk version of what is needed for #3626.

Add the "ignore_device" field to the INI file.  This allows us to
specifically list devices that should be ignored by the openib BTL
(such as the Intel Phi, at least as of May 2013 -- see #3626).  

Also add the Intel Phi to the ini file, and set its ignore_device=1.

Finally, add the concept of counting intentionally ignored verbs
devices.  Devices are ignored for one of two reasons:

 * If the number of allowed ports on that device is 0 (i.e., if
   if_include/if_exclude was set such that we're intentionally
   ignoring this device).
 * If the INI ignore_device field for this device is set to 1.

Once we have the count of devices that were intentionally ignored,
only show the "Hey, there's verbs devices that you're not using!"
show_help message if there are devices that were ''unintentionally''
ignored.

This commit was SVN r28589.

The following Trac tickets were found above:
  Ticket 3626 --> https://svn.open-mpi.org/trac/ompi/ticket/3626
2013-06-05 12:12:09 +00:00
Jeff Squyres
3019b7a3f8 Oops! Remove duplicate registration.
This commit was SVN r28588.
2013-06-05 11:55:19 +00:00
Jeff Squyres
1de00b17ad Properly check the return status from registering the MCA params.
This commit was SVN r28587.
2013-06-05 11:53:18 +00:00
Nathan Hjelm
e48bd9809e Add useful messages for MPI_T error codes
This commit was SVN r28584.
2013-06-04 23:18:44 +00:00
Jeff Squyres
d692aba672 Remove the DR PML. It was abondoned long ago. It had a nice life,
a few papers, and now a decent demise with respect.  

This commit was SVN r28582.
2013-06-04 19:36:16 +00:00
Jeff Squyres
d1dc4da292 Fix typo (the debugger might not be TotalView).
This commit was SVN r28577.
2013-05-31 00:39:05 +00:00
Edgar Gabriel
87b3782b7f arghh, copy-and-paste error, status->_ucount has to be set to 0 not max_data for count=0.
This commit was SVN r28576.
2013-05-30 22:00:29 +00:00
Edgar Gabriel
9daec82f17 - make a fileview of 0 bytes work in ompio
- fixes the bug reported in ticket 3619 (which is already closed) also for ompio

This commit was SVN r28575.
2013-05-30 21:33:13 +00:00
Rolf vandeVaart
3d1d158a80 Do not abort in BTL. Rather, callback into PML error function. Thanks George for review.
This commit was SVN r28559.
2013-05-23 18:45:23 +00:00
George Bosilca
a9aae9c538 Patch based on Takahiro Kawashima fixing the issues with some
of the Fortran datatypes. This patch prevent the copy of the
datatype description from the OPAL to the OMPI layer in order
to decrease the memory requirements.

This commit was SVN r28553.
2013-05-22 18:35:21 +00:00
Jeff Squyres
43a534a5c6 Revert r28544: the original code was fine.
This commit was SVN r28549.

The following SVN revision numbers were found above:
  r28544 --> open-mpi/ompi@c830d96673
2013-05-21 16:06:08 +00:00
Jeff Squyres
c830d96673 Silence compiler warnings, as suggested by Alan Sayre.
This commit was SVN r28544.
2013-05-21 13:42:18 +00:00
Nathan Hjelm
721779d7ab Per RFC: remove old MCA parameter system.
This commit was SVN r28541.
2013-05-20 15:36:13 +00:00
Ralph Castain
889bf60c64 Fix bad merge
This commit was SVN r28540.
2013-05-18 01:29:55 +00:00
Jeff Squyres
089c632cce Remove a bunch of dead code: gcc 4.7 warns of set-but-unused
variables.  So get rid of them.

This commit was SVN r28538.
2013-05-17 21:45:49 +00:00
Edgar Gabriel
1b1051da6c fix a bug in the calculation of the explicit offset. Use the opportunity to
clean up the code a bit.

This commit was SVN r28537.
2013-05-17 20:22:00 +00:00
George Bosilca
4d9f30fb05 Fix issue identified by Takahiro Kawashima regarding the overwriting
of the OPAL datatype descriptions upon MPI_Init. Now each layer (OPAL
and OMPI) uses it's own descriptions for the predefined datatypes,
thus preventing over-writing of the other layer data description.

This commit was SVN r28535.
2013-05-17 13:09:16 +00:00
Ralph Castain
3e6e1046a3 fix a correctness issue by returning an error if waitall fails and invoking the mpi error handler
cmr:v1.7.2:reviewer=jsquyres

This commit was SVN r28533.
2013-05-16 15:04:37 +00:00
Matthias Jurenz
ef0a080028 Changes to VT:
- fixed compiler warnings when compiling for 32-bit
	- MPI wrapper generator scripts:
		- removed non-posix call to length(array)
		- exit scripts if any statement returns a non-true return value (set -e)

This commit was SVN r28524.
2013-05-15 10:44:51 +00:00
Rolf vandeVaart
91fdb423d7 Fix warning in CUDA-aware code.
This commit was SVN r28511.
2013-05-14 21:04:15 +00:00
Rolf vandeVaart
52ebb0b17f Change some opal_output to OPAL_OUTPUT per CMR review.
This commit was SVN r28510.
2013-05-14 20:49:42 +00:00
Nathan Hjelm
32a8ff5255 btl/openib: bump up udcm priority
This commit was SVN r28505.
2013-05-14 20:02:40 +00:00
Edgar Gabriel
d5cae9aced - fix the mca stripe size and stripe depth parameter logic in the pvfs2
component
- correctly recognize and handle the corresponding info objects.

This commit was SVN r28497.
2013-05-14 16:11:39 +00:00
Matthias Jurenz
9a0432632a Changes to VT:
- general:
		- incremented version number to 5.14.4
		- fixed Coverity CIDs: 72002, 72099, 72273, 710580, 710664, 710665, 710666
	- VT libs:
		- fixed "incompatible declaration" errors when building against an MPI-3 implementation
		  Since MPI-3 the C keyword "const" is added to all relevant MPI API parameters
		  (e.g. MPI_Send(void* sendbuf, ...) -> MPI_Send(const void* sendbuf, ...)).
		  Prepending the macro CONST to these parameters which is defined either to "const" (if MPI-3) or to nothing (if MPI-1/2).
		- fixed potential buffer overflow when reading the filter file
		- CUDA tracing:
			- enabled access to CUPTI counters for CUDA tracing via CUPTI
			- enabled GPU memory usage tracing independent of the CUDA API
			- enabled recording of CUDA synchronization and implicit synchronization in blocking CUDA memory copies for CUDA tracing via CUPTI
			- enabled recording of synchronous peer-to-peer CUDA memory copies for CUDA tracing via CUPTI
			- consider CUDA data transfers as not idle for option 'pure_idle'
			- fixed identification of the CUDA device ID for CUDA tracing via CUPTI
			- fixed region filtering for applications using the CUDA runtime API wrapper
	- compiler wrappers: add path to mpi.h to the PDT parser command and preprocessor flags

This commit was SVN r28494.
2013-05-14 14:28:04 +00:00
Yossi Etigin
64d98e0438 Fix data corruption in MXM by registering to OPAL memory release hooks and removing any mappings created by mxm
This commit was SVN r28489.
2013-05-14 12:27:44 +00:00
Jeff Squyres
4d9da92e60 Fixes trac:376: bu default the wrappr compilers will enable rpath support
in generated executables on systems that support it.  Use
--disable-wrapper-rpath to disable this behavior.  See text in
README about --disable-wrapper-rpath for more details.

This commit was SVN r28479.

The following Trac tickets were found above:
  Ticket 376 --> https://svn.open-mpi.org/trac/ompi/ticket/376
2013-05-11 00:49:17 +00:00
Rolf vandeVaart
9d569f1487 Fix warning when compiling in CUDA aware code.
This commit was SVN r28476.
2013-05-10 21:29:08 +00:00
Ralph Castain
ae68a953f4 Sigh - one more place
This commit was SVN r28447.
2013-05-05 00:25:14 +00:00
Nathan Hjelm
422331b4da btl/openib: fix unconnected datagram connection method (udcm)
The primary issue with udcm is that the immediate data in message
acks were often bogus. This caused the sender to keep trying even
though a message was received and acked. The fix is to use the
source LID and QP to determine which message is being acked. In
most cases this should work well since only one message will be
in flight to any peer.

This commit was SVN r28444.
2013-05-03 17:11:38 +00:00
Jeff Squyres
c8258c06e2 In coll_sm, we alloc a huge chunk of shared memory, divvy it into lots
of individual regions (each region is a multiple of page size in
length), and each process claims its own regions by binding it to its
local memory.  Each process would end up membining something like 16
individual regions in the overall shmem segment.

There were two errors in this code relating to the memory affinity
pinning.  Some combination of these two errors would lead to kernel
panics (!) on my RHEL 6.2 x86_64 machines when used with mmap'ed
shared memory (not posix or sysv shared memory, curiously enough):

1. The shared memory segment is initially divided into two regions:
control and data.  The control starts at the beginning of the shmem
segment, the data starts after that.  The data portion, unfortunately,
was ''not'' aligned to a page.  So all the multiple-of-page-size
regions that we divvy up were also not alined on page boundaries.  And
therefore all the regions we tried to membind were not on page
boundaries.

The solution was to ensure that the data portion started on a page
boundary.  Then all of the individual regions were on page boundaries,
too.

That being said, in my tests, Linux mbind() fails gracefully when the
address is not on a page boundary.  So I'm not sure how this worked at
all / led to a kernel panic...

2. There was some bad pointer math that resulted in membinding regions
larger than they should have been, resulting in region overlaps.
There were definitely overlaps between regions in the same process;
it's likely that there were overlaps between regions of multiple
processes, too -- I'm not sure (and don't care to figure out :-) ).

The solution was to fix the pointer math so that each region membinds
exactly only itself and no neighboring/overlapping regions.

cmr:v1.7.2:reviewer=samuel

This commit was SVN r28442.
2013-05-03 12:49:35 +00:00