Commit graph

14503 commits

Author SHA1 Message Date
Nadia Derbey
4e71d4ae32 remove trailing colon at the end of the generated LD_LIBRARY_PATH
This commit was SVN r22677.
2010-02-22 07:47:30 +00:00
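
The commit above does not show how the path is generated, so the following is only a minimal sketch of the idea. A trailing colon leaves an empty last element in the list, and some dynamic linkers interpret an empty element as the current directory. The helper name `strip_trailing_colons` is hypothetical and does not come from the Open MPI tree.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not Open MPI code): remove any trailing ':'
 * characters from a colon-separated path list, modifying it in place.
 * Returns the same pointer for convenience. */
char *strip_trailing_colons(char *path)
{
    assert(path != NULL);
    size_t len = strlen(path);
    while (len > 0 && path[len - 1] == ':') {
        path[--len] = '\0';   /* drop the empty last element */
    }
    return path;
}
```

Under these assumptions, a generated value such as `/opt/lib:/usr/lib:` would be reduced to `/opt/lib:/usr/lib` before being exported.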
Christopher Yeoh
a14a5dc3c6 This fixes a bug where sometimes the rcache lock would be dropped when it wasn't actually held.
Also includes some minor copyright header additions that were missed in previous checkins.
fixes trac:2101 added cmr:v1.4

This commit was SVN r22676.

The following Trac tickets were found above:
  Ticket 2101 --> https://svn.open-mpi.org/trac/ompi/ticket/2101
2010-02-22 07:40:42 +00:00
Christopher Yeoh
bccafbb5df Fixes the problem where the rcache and core memory allocation can deadlock itself
This commit fixes trac:2104. Request a cmr:v1.4

This commit was SVN r22675.

The following Trac tickets were found above:
  Ticket 2104 --> https://svn.open-mpi.org/trac/ompi/ticket/2104
2010-02-22 05:12:10 +00:00
Ralph Castain
65a8ab4267 Cleanup the kill_procs command. Send a SIGTERM initially to allow C/R operations, and to be polite. Correctly update proc state if there is a problem so we don't hang.
The change to just using SIGKILL was originally done due to problems whereby waitpid thought a proc had died, but it hadn't. We'll continue debugging that problem separately, but SIGTERM is required for C/R to work properly.

This commit was SVN r22674.
2010-02-21 19:35:32 +00:00
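
The "SIGTERM first" policy described in the commit above can be sketched roughly as follows. This is not Open MPI's kill_procs code: the function name `terminate_politely`, the grace period and the 10 ms polling interval are all assumptions made for illustration, and the SIGKILL fallback is a common pattern rather than something the commit message confirms.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: ask a child to exit with SIGTERM, poll for up to
 * grace_ms milliseconds, then fall back to SIGKILL.
 * Returns 0 and stores the wait status on success, -1 on error. */
int terminate_politely(pid_t pid, int grace_ms, int *status)
{
    const struct timespec tick = { 0, 10 * 1000 * 1000 }; /* 10 ms */

    assert(status != NULL);
    if (kill(pid, SIGTERM) != 0) {
        return -1;
    }
    for (int waited = 0; waited < grace_ms; waited += 10) {
        pid_t r = waitpid(pid, status, WNOHANG);
        if (r == pid) {
            return 0;             /* child exited within the grace period */
        }
        if (r < 0 && errno != EINTR) {
            return -1;            /* e.g. ECHILD: not our child */
        }
        nanosleep(&tick, NULL);
    }
    if (kill(pid, SIGKILL) != 0) {
        return -1;
    }
    /* SIGKILL cannot be caught or ignored, so a blocking wait is safe. */
    pid_t r;
    do {
        r = waitpid(pid, status, 0);
    } while (r < 0 && errno == EINTR);
    return r == pid ? 0 : -1;
}
```

Returning an error instead of waiting forever mirrors the commit's second point: the caller can update the process state when something goes wrong rather than hang.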
Shiqing Fan
fa6a050b80 Set the correct install source path.
This commit was SVN r22673.
2010-02-20 13:40:15 +00:00
Shiqing Fan
e0bfd9f836 A type cast.
This commit was SVN r22672.
2010-02-20 10:47:37 +00:00
Edgar Gabriel
61dee816db This commit fixes a bug in how we handle the case where a 'dependent'
communicator that we created has a lower CID than the parent comm. This can
happen when using the hierarch collective communication module or for
inter-communicators (since we make a duplicate of the original communicator).
This is not a problem as long as the user calls MPI_Comm_free on the parent 
communicator.  However, if the communicators are not freed by the user but
released by Open MPI in MPI_Finalize, we walk through the list of still
available communicators and free them one by one. Thus, local_comm is freed
before the actual inter-communicator. However, the local_comm pointer in the
inter communicator will still contain the 'previous' address of the local_comm
and thus this will lead to a segmentation violation. In order to prevent that
from happening, we increase the reference counter of local_comm by one if its CID
is lower than the parent. However, we cannot increase its reference counter if
the CID of local_comm is larger than the CID of the inter-communicator, since
a regular MPI_Comm_free would in that case leave the local_comm hanging
around and thus we would not recycle CIDs properly, which was the root
cause of this trouble.

This commit fixes tickets 2094 and 2166. Note however, that I want to close
them manually, since a slightly different patch is required for the 1.4
series. This commit will have to be applied for the 1.5 series. And I will
need a volunteer to review it.

This commit was SVN r22671.
2010-02-19 23:45:30 +00:00
Rainer Keller
548d6f7c61 - Incorporated a rewording proposal by Jeff.
This commit was SVN r22670.
2010-02-19 14:37:09 +00:00
George Bosilca
7eff2cdf85 Unrestricted number of interfaces.
This commit was SVN r22669.
2010-02-19 07:10:32 +00:00
George Bosilca
3356c2e241 Don't forget to update the return value for PPC32 and PPC64.
This commit was SVN r22665.
2010-02-18 19:16:41 +00:00
George Bosilca
ab202d0f69 Add the memory and the cc to the clobber list for the cas atomics.
This commit was SVN r22664.
2010-02-18 19:15:50 +00:00
Ethan Mallove
52f0f75a28 In case config/missing gets invoked, ensure that all the OMPI-specific m4
macros are defined.

This commit was SVN r22663.
2010-02-18 18:11:23 +00:00
Rainer Keller
8dd87def77 - Keep only the _LAST_ entry when reading in output from mount:
On Jaguar / is NFS-mounted over the initially mounted ROOTFS...

This commit was SVN r22662.
2010-02-18 18:05:55 +00:00
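
A minimal sketch of the "keep only the last entry" rule from the commit above, assuming the mount output has already been parsed into a table. The `struct mount_entry` type and the `fstype_for` function are hypothetical names, not taken from the Open MPI sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parsed form of one line of mount output. */
struct mount_entry {
    const char *dir;     /* mount point, e.g. "/" */
    const char *fstype;  /* file system type, e.g. "nfs" */
};

/* Return the file system type of the LAST entry mounted on dir, or NULL
 * if dir does not appear. A later mount on the same directory hides the
 * earlier one, so later entries must override earlier ones. */
const char *fstype_for(const struct mount_entry *tab, size_t n,
                       const char *dir)
{
    const char *found = NULL;

    assert(tab != NULL && dir != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tab[i].dir, dir) == 0) {
            found = tab[i].fstype;  /* keep scanning: last match wins */
        }
    }
    return found;
}
```

With a table listing `/` first as `rootfs` and later as `nfs`, the lookup for `/` yields `nfs`, which matches the situation the commit describes on Jaguar.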
Jeff Squyres
6c2c68907c If there is a .hgignore_local file, also add that to the end of the
.hgignore file.  This is helpful for svn+hg combo trees.

This commit was SVN r22660.
2010-02-18 14:53:00 +00:00
Matthias Jurenz
111a424dac - removed hard-coded directory paths in vt_dyninst.c
- temporarily disabled wrapper for 'fcntl' in vt_iowrap.c, due to curious behaviour on some platforms (e.g. segfault)

This commit was SVN r22659.
2010-02-18 10:36:20 +00:00
Pavel Shamis
a124f6b10b Adding a hash table for managing dependencies between SRQs and their BTL modules.
This commit was SVN r22653.
2010-02-18 09:48:16 +00:00
Ralph Castain
2be03b4fb6 Cleanup a few bugs in the rmcast subsystem
This commit was SVN r22650.
2010-02-18 01:54:45 +00:00
Rainer Keller
a46cecf4f2 - Use strrchr instead of loop for '/' as Nysal suggests.
This commit was SVN r22649.
2010-02-17 23:40:08 +00:00
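
The commit above does not show the code it changed, so this is only an illustration of the general technique: finding the last '/' with the standard `strrchr` call instead of a hand-written loop. The function name `last_path_component` is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the part of path that follows
 * the last '/', or path itself when it contains no '/'. The result points
 * into the original string; nothing is copied or modified. */
const char *last_path_component(const char *path)
{
    assert(path != NULL);
    const char *slash = strrchr(path, '/');  /* last '/' or NULL */
    return slash != NULL ? slash + 1 : path;
}
```

Note that a path ending in '/' yields an empty string under this sketch; whether that is the desired behaviour depends on the caller.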
Ralph Castain
40be3d896c Ensure we set an error code when leaving, correctly check for slot_list_set return status
This commit was SVN r22643.
2010-02-17 22:59:19 +00:00
Jeff Squyres
c23e6f3d56 Add an opal_attribute_unused in here since we're no longer using this
parameter (I just discovered while researching for v1.4 that v1.4 has
effectively this same function definition: it just always returns
true!).

This commit was SVN r22642.
2010-02-17 21:12:49 +00:00
Jeff Squyres
898eedd78f Fixes trac:2233.
This commit adds a lengthy comment in ompi_datatype.h that explains
why a one-sided datatype check was removed.  The short version is that
we do have to allow some datatypes that may be unwise to use (e.g.,
"h" types of datatypes that have offsets in bytes -- MPI says it's ok
to use these), and our DDT engine can't currently detect datatypes
with absolute offsets, which MPI says it's ''not'' ok to use with
one-sided operations.  Hence, we don't check for some datatypes that
are invalid to use with one-sided operations, and erroneous programs
may crash and burn.  Life is hard.

The main point of this commit is that we now do allow datatypes for
one-sided operations that are supposed to be allowed.

This commit was SVN r22641.

The following Trac tickets were found above:
  Ticket 2233 --> https://svn.open-mpi.org/trac/ompi/ticket/2233
2010-02-17 20:16:55 +00:00
Jeff Squyres
17f0885f12 Add proper BSD interface detection code. Fixes a long-standing
discussion on the users list (see
http://www.open-mpi.org/community/lists/users/2009/12/11526.php). 

Many thanks to Kevin Buckley who did most of the coding work, and to
Aleksej Saushev for his extreme patience in waiting for me to review
and commit this stuff.

This commit was SVN r22640.
2010-02-17 19:43:57 +00:00
George Bosilca
3bceb20b1c Only get the receive datatype extent on the root process, as every
other process should ignore this value. Thanks to Michael Hofmann
for investigating this issue.

This commit closes trac:2268.

This commit was SVN r22639.

The following Trac tickets were found above:
  Ticket 2268 --> https://svn.open-mpi.org/trac/ompi/ticket/2268
2010-02-17 16:01:50 +00:00
Terry Dontje
2a4b1227d9 corrected an array access bug in the latest libevent merge (see #2234) that was causing Solaris binaries to loop infinitely.
This commit was SVN r22638.
2010-02-17 14:50:37 +00:00
Matthias Jurenz
1ce37bc5ce VT general:
- Updated date in copyright header of each source file
VT configure fixes:
- fixed configure's version detection for PAPI to support version 4.x
- added configure tests to detect Bull MPICH2
VT new features:
- added support for relocating an existing VampirTrace installation without rebuilding it from source (fixes OMPI's ticket #1990)
- added support for tracing functions in shared libraries instrumented by the GNU, Intel, Pathscale, or PGI 9 compiler
- added support for PAPI-C counters which belong to different components
- extended usability of environment variable VT_METRICS for PAPI counters to specify whether a counter provides increasing or absolute values

This commit was SVN r22637.
2010-02-17 14:38:11 +00:00
Shiqing Fan
3a3018deef Convert the line endings for the added header files. They were changed automatically by Windows when adding new files.
This commit was SVN r22634.
2010-02-16 17:24:44 +00:00
Rainer Keller
4ded1651a3 - Don't we love Fortran: while everybody was of the opinion the
continuation mark in column 6 is enough to split the lines, we do
   the same continuation mark in column 73.
   Now, while that would fit any msg., this would produce warnings when
   including mpif-config.h  with -Wall in gfortran and -warn in ifort.

   Just get the SVN Version string short and forget it. Let's see
   make check choke on that.

   This additionally fixes the HG version string...

   (Will mention this in ticket #2259)

This commit was SVN r22620.
2010-02-16 02:25:00 +00:00
Shiqing Fan
08ffdbe987 Changes for portable platform headers. Commit it on behalf of Ralph.
This commit was SVN r22619.
2010-02-15 22:14:59 +00:00
Shiqing Fan
0b765637d9 A type cast.
This commit was SVN r22618.
2010-02-15 10:26:02 +00:00
Pavel Shamis
9d0ae097c1 Updating vendor part ids for some Mellanox devices
This commit was SVN r22617.
2010-02-15 09:45:34 +00:00
Ralph Castain
9a5fdbb622 Continue development of reliable multicast
This commit was SVN r22616.
2010-02-14 19:20:56 +00:00
Ralph Castain
7a1b2a706e Add a new ring_buffer class
This commit was SVN r22615.
2010-02-14 19:20:19 +00:00
Ralph Castain
58a1151566 Ensure the man page gets into the tarball
This commit was SVN r22613.
2010-02-13 02:39:10 +00:00
Nysal Jan
0538b1a948 Adding GPFS to the list of file systems checked
This commit was SVN r22612.
2010-02-12 14:15:39 +00:00
Jeff Squyres
dafc0c914b Restoring the build for now.
This commit was SVN r22611.
2010-02-12 12:03:17 +00:00
Shiqing Fan
3ab892ad8a Add exclude pattern for installing directory.
This commit was SVN r22610.
2010-02-12 10:24:22 +00:00
Rainer Keller
48254c78c9 - When svn version string becomes too long (>72 columns) some Fortran
compilers get confused. Continue on the next line.
   Thanks to Richard Tran Mills for noticing that.

This commit was SVN r22609.
2010-02-11 23:23:36 +00:00
Ralph Castain
dc6de3e9b5 Add an "orte-info" tool to report ORTE and OPAL configuration and parameter info ala "ompi_info". Temporary solution until refactoring of the "info" system can be undertaken.
This commit was SVN r22608.
2010-02-11 23:02:14 +00:00
Rainer Keller
ecbd530a77 - Well well, that's what one gets when turning on all kinds of old
tests ;-)) Turn them off again, didn't have time to look into them
   Also, the test-program on eddie.osl.iu.edu, detects the rpc_pipefs
   mounted on /var/lib/nfs/rpc_pipefs, required for NFS.

This commit was SVN r22607.
2010-02-11 22:07:07 +00:00
Jeff Squyres
6c5f666890 Add a comment to the loopback check to explain why it is there. Also
slightly correct one other comment.

This commit was SVN r22606.
2010-02-11 14:59:04 +00:00
Josh Hursey
ec4498c258 update copyright
This commit was SVN r22605.
2010-02-11 14:03:36 +00:00
Rainer Keller
ea4de16561 - Check whether file is opened on network file-system.
If file does not exist, check the directory it lives in...
   Maybe used by caller, trying to open mmap() on NFS, Lustre or
   Panasas (thanks Sam).
   For now, this is used to warn about the usage of mmap on such FS.

   Please note, that Ralph mentioned the orte_no_session_dir parameter.
   The help message includes a reference to this.

   Tested on NFS and Lustre on Linux on
     smoky: mpirun --mca orte_tmpdir_base $HOME/tmp -np 2 ./mpi_stub
     jaguar: mpirun ... --mca orte_tmpdir_base /tmp/work/$USER ...

   Fixes trac:1354

   This should   cmr:v1.5   once it has soaked and is shown to work on
   Solaris

This commit was SVN r22604.

The following Trac tickets were found above:
  Ticket 1354 --> https://svn.open-mpi.org/trac/ompi/ticket/1354
2010-02-10 23:18:29 +00:00
Jeff Squyres
ee8b9e9c3a * Somehow we got an extra copy of the 1.4.1 bullets in here
* Added note about --report-bindings and mpi_paffinity_alone fixes

This commit was SVN r22603.
2010-02-10 23:04:45 +00:00
Jeff Squyres
a89dc623b0 Brice Goglin noticed that mpi_paffinity_alone didn't seem to be doing
anything for non-MPI apps.  Oops!  (But before you freak out, gentle
reader, note that mpi_paffinity_alone for MPI apps still worked fine)
When we made the switchover somewhere in the 1.3 series to have the
orted's do processor binding, then stuff like:

  mpirun --mca mpi_paffinity_alone 1 hostname

should have bound hostname to processor 0.  But it didn't because of a
subtle startup ordering issue: the MCA param registration for
opal_paffinity_alone was in the paffinity base (vs. being in
opal/runtime/opal_params.c), but it didn't actually get registered
until after the global variable opal_paffinity_alone was checked to
see if we wanted old-style affinity bindings.  Oops.

However, for MPI apps, even though the orted didn't do the binding,
ompi_mpi_init() would notice that opal_paffinity_alone was set, yet
the process didn't seem to be bound.  So the MPI process would bind
itself (this was done to support the running-without-orteds
scenarios).  Hence, MPI apps still obeyed mpi_paffinity_alone
semantics.

But note that the error described above caused the new mpirun switch
--report-bindings to not work with mpi_paffinity_alone=1, meaning that
the orted would not report the bindings when mpi_paffinity_alone was
set to 1 (it ''did'' correctly report bindings if you used
--bind-to-core or one of the other binding options).

This commit separates out the paffinity base MCA param registration
into a small function that can be called at the Right place during the
startup sequence.

This commit was SVN r22602.
2010-02-10 22:32:00 +00:00
Rainer Keller
583bb42739 - Adapt for changed opal_init() arguments -- takes argc&argv
It's orte/constants.h not orte/orte_constants.h

This commit was SVN r22594.
2010-02-10 18:29:01 +00:00
Rainer Keller
c161cf5fa4 - These orte tests refer to include files not available anymore, call
functions not in the orte-tree, so disable for now.

This commit was SVN r22593.
2010-02-10 18:21:04 +00:00
Jeff Squyres
8f7edf6e3e After a '''lot''' of discussion and testing, this commit fixes some
long-standing bugs (see trac ticket list below).  They're currently
somewhat obscure bugs, but are becoming much more relevant in a world
where OpenFabrics devices fail and you replace them with a newer model
(i.e., the cluster is homogeneous... ''except'' for where you had to
replace one or two OpenFabrics devices, and the same model is no
longer available).

This commit includes a '''lengthy''' comment (that we spent a lot of
time writing!) about what exactly it does and does not do.  The
previous code was rather short and '''incredibly''' subtle.  The new
code is slightly longer, but is both much more explicit and much more
painstakingly documented.

This commit fixes multiple trac tickets.  The real one that we fix is
#1707; the others are fixed as a side-effect.  In short: fixing #1707
prevents Bad Things from happening later in the startup sequence.

Fixes trac:1707, #2164, #1574.

cmr:v1.4.2:reviewer=pasha
cmr:v1.5:reviewer=pasha

This commit was SVN r22592.

The following Trac tickets were found above:
  Ticket 1707 --> https://svn.open-mpi.org/trac/ompi/ticket/1707
2010-02-10 16:53:26 +00:00
Rainer Keller
3ca8adb540 - The only differences in the underlying header file between
GASNet-1.12.0  and  GASNet-1.14.0.

This commit was SVN r22591.
2010-02-10 14:07:44 +00:00
Nysal Jan
97d66bce78 This fixes trac:2154 - CSUM PML false positive. Needs to go to both cmr:v1.4.2 and cmr:v1.5
This commit was SVN r22590.

The following Trac tickets were found above:
  Ticket 2154 --> https://svn.open-mpi.org/trac/ompi/ticket/2154
2010-02-10 10:24:16 +00:00
Steve Wise
d40d2165c0 Never advertise a loopback address (127/8) to your peers.
This commit was SVN r22589.
2010-02-09 19:07:33 +00:00