Commit graph

12336 commits

Author SHA1 Message Date
Jeff Squyres
b5123cb79f Don't destroy the event channel until after everything else has been
torn down.  Fixes trac:1582.

This commit was SVN r19800.

The following Trac tickets were found above:
  Ticket 1582 --> https://svn.open-mpi.org/trac/ompi/ticket/1582
2008-10-24 15:04:54 +00:00
Jeff Squyres
ae34fd150a It always helps to initialize a variable before you try to use it
(vs. only initializing it in some cases).

This commit was SVN r19796.
2008-10-24 14:20:07 +00:00
Jeff Squyres
d96b78fee1 If the script is there, there's no real reason to have these files in
the repo.

This commit was SVN r19795.
2008-10-24 13:42:26 +00:00
Jeff Squyres
0a741d7f81 Add scripty-foo to make the data files. Revamp the data files to be
non-uniform in content as a slightly better test.

This commit was SVN r19794.
2008-10-24 13:35:47 +00:00
Ralph Castain
c56cdac379 Finish cleanup of stdin. Set non-stdio file descriptors to non-blocking (thanks to Jeff for catching that one). Handle writes that result in "would have blocked" errno.
This commit was SVN r19793.
2008-10-24 01:42:58 +00:00
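As an illustration of the change described in c56cdac379, here is a minimal sketch of putting a file descriptor into non-blocking mode and treating a "would have blocked" errno as a retry-later condition rather than a failure. It uses only POSIX fcntl() and write(); the helper names set_nonblocking and write_some are hypothetical and are not taken from the Open MPI source.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode.
 * Returns 0 on success, -1 on error (errno is set by fcntl). */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) {
        return -1;
    }
    return (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) ? -1 : 0;
}

/* Hypothetical helper: attempt a single write on a non-blocking fd.
 * Returns the number of bytes written, 0 if the write would have
 * blocked (or len was 0), and -1 on any other error. */
static ssize_t write_some(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
    ssize_t n = write(fd, buf, len);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        return 0;   /* not a failure: retry once the fd is writable */
    }
    return n;
}
```

A caller would typically keep the unwritten data queued and retry write_some() when its event loop reports the descriptor as writable.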
Ralph Castain
6100d88ded Cleanup the new IOF:
1. remove some stale files that were overlooked in original commit

2. add a test program and data to stress iof for stdin

3. cleanup a debug statement that caused memory corruption when reading large files

4. some minor cleanups to correctly handle xon/xoff scenarios

This commit was SVN r19792.
2008-10-23 19:11:05 +00:00
Jeff Squyres
238a6f851f Wow. And I won't name who it was who did this. You know who you are! :-)
We had a public symbol named "already_opened".  This commit changes
the name to mca_btl_base_already_opened.

I guess it's good we have the visibility stuff enabled by default!
:-)

This commit was SVN r19791.
2008-10-23 18:56:10 +00:00
George Bosilca
9260f6b157 There is no reason to ask for an ACK from the BTL here.
This commit was SVN r19789.
2008-10-22 20:13:33 +00:00
George Bosilca
7dfdf3e907 A lot of MX fixes.
1. Allow MX bonding via btl_mx_bonding MCA parameter. With this on, Open MPI
  assumes that lib MX will do the bonding, and we will only return one BTL.
  Otherwise, we return as many BTLs as there are devices.
2. Decrease the memory footprint, by cleaning up what we store about the
  peers and how we store it.
3. Allow multiple MX routes that share the same mapper. In this particular
  case we will link by their nic_id.
4. Allow multiple MX routes with multiple mappers. In this case we match
  the NICs based on the last 6 digits of the mapper MAC.
5. Increase the size of the eager and rendez-vous eager limits in the
  case where we are unable to register an unexpected callback with MX.
6. Increase the default max number of MX fragments.
7. Increase the max number of MX BTLs.
8. Only allow mx_if_include and mx_if_exclude if we have access to the
  mapper.

This commit was SVN r19788.
2008-10-22 20:12:30 +00:00
Terry Dontje
a39d8d62e7 Correct the "default" message for --enable-cxx-exceptions to "disabled", which
is how the code actually behaves, and correct some F90 comments to F77.

This commit was SVN r19784.
2008-10-21 20:54:08 +00:00
Jeff Squyres
ac698173b9 * Ensure that the C++ exceptions flags are passed to the C and Fortran
   compilers as well.  Not doing this was causing problems with
   MPI::ERRORS_THROW_EXCEPTIONS with gcc in 32 bit (but not 64 bit!).
 * Ensure that the C and Fortran compilers actually like the C++
   exceptions flags.  If not, currently just abort.  Let's see if
   anyone complains about this -- I doubt they will because a) C++
   exception support is not enabled by default, and b) I think C++
   exceptions really only make sense within the same compiler family.

This commit was SVN r19783.
2008-10-21 20:25:20 +00:00
Tim Mattox
56c014a3a2 Resync the NEWS with the 1.2 branch.
This commit was SVN r19780.
2008-10-21 18:30:19 +00:00
George Bosilca
83474b2e1a Solve a modulus rounding error. As the modulus can be signed (in C89 it takes
the sign of the dividend), we have to cast the pointer to a uintptr_t in
order to be able to correctly compute how to align it on the cache line.
Reported and solved by Stephan Kramer. Thanks Stephan.

This commit was SVN r19778.
2008-10-21 17:00:39 +00:00
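The alignment issue described in 83474b2e1a can be sketched as follows. If the address is held in a signed integer type, the % operator can yield a negative remainder (C99 truncates toward zero, so the result takes the sign of the dividend; earlier C left this implementation-defined), and the computed padding is then wrong. Doing the arithmetic on uintptr_t keeps the remainder in the range [0, alignment). The helper names below are hypothetical and are not the functions touched by the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round addr up to the next multiple of alignment.
 * All arithmetic is unsigned, so the remainder is never negative. */
static uintptr_t align_addr_up(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0);
    uintptr_t remainder = addr % alignment;
    return (remainder == 0) ? addr : addr + (alignment - remainder);
}

/* Hypothetical wrapper: the same computation applied to a pointer,
 * for example to align a buffer on a cache line boundary. */
static void *align_ptr_up(void *ptr, size_t alignment)
{
    return (void *)align_addr_up((uintptr_t)ptr, (uintptr_t)alignment);
}
```

For example, align_addr_up(130, 128) gives 256, while an address that is already a multiple of 128 is returned unchanged.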
George Bosilca
61317cb61d Complete the r19767 commit for XGrid, i.e. allow the PLM Xgrid to build.
This commit was SVN r19777.

The following SVN revision numbers were found above:
  r19767 --> open-mpi/ompi@6e5d844c36
2008-10-21 15:37:22 +00:00
Jeff Squyres
f2a7993aa5 Refs trac:1578: Shiqing-suggested changes for valgrind configure.m4 support.
This commit was SVN r19776.

The following Trac tickets were found above:
  Ticket 1578 --> https://svn.open-mpi.org/trac/ompi/ticket/1578
2008-10-21 03:27:43 +00:00
Tim Mattox
07eda5d696 Resync the NEWS file with the 1.2 branch.
This commit was SVN r19774.
2008-10-20 15:33:39 +00:00
Ralph Castain
1cfa0c26a1 Add non-debug platform for mac
This commit was SVN r19772.
2008-10-18 13:10:05 +00:00
Ralph Castain
ebaa2c59bb Cleanup non-debug builds
This commit was SVN r19771.
2008-10-18 13:09:47 +00:00
Jeff Squyres
6d026b86b7 Fix a problem reported on the user list by Teng Lin: OPAL_PREFIX
wasn't exported in the Bourne-shell-flavor case on remote nodes.

This commit was SVN r19770.
2008-10-18 12:13:10 +00:00
Jeff Squyres
d96003fec5 Fix typo.
This commit was SVN r19769.
2008-10-18 11:52:41 +00:00
Jeff Squyres
8ea27c0ced Add a missing header file to the Makefile.am so that it can be
included in the distribution tarball.

This commit was SVN r19768.
2008-10-18 11:09:57 +00:00
Ralph Castain
6e5d844c36 Roll in the revamped IOF subsystem. Per the devel mailing list email, this is a complete rewrite of the iof framework designed to simplify the code for maintainability, and to support features we had planned to do, but were too difficult to implement in the old code. Specifically, the new code:
1. completely and cleanly separates responsibilities between the HNP, orted, and tool components.

2. removes all wireup messaging during launch and shutdown.

3. maintains flow control for stdin to avoid large-scale consumption of memory by orteds when large input files are forwarded. This is done using an xon/xoff protocol.

4. enables specification of stdin recipients on the mpirun cmd line. Allowed options include rank, "all", or "none". Default is rank 0.

5. creates a new MPI_Info key "ompi_stdin_target" that supports the above options for child jobs. Default is "none".

6. adds a new tool "orte-iof" that can connect to a running mpirun and display the output. Cmd line options allow selection of any combination of stdout, stderr, and stddiag. Default is stdout.

7. adds a new mpirun and orte-iof cmd line option "tag-output" that will tag each line of output with process name and stream ident. For example, "[1,0]<stdout>this is output"

This is not intended for the 1.3 release as it is a major change requiring considerable soak time.

This commit was SVN r19767.
2008-10-18 00:00:49 +00:00
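The "tag-output" format quoted in item 7 of 6e5d844c36 can be illustrated with a small formatting sketch. It assumes that the two numbers in "[1,0]" are a job identifier and a process rank, which the commit message only describes as the "process name"; the function format_tagged_line is hypothetical and is not part of ORTE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: prefix one line of output with a process name
 * and stream identifier, e.g. "[1,0]<stdout>this is output".
 * Assumption: the process name is printed as "job,rank".
 * Returns the snprintf() result: the length the full string needs,
 * excluding the terminating NUL. */
static int format_tagged_line(char *out, size_t out_size,
                              unsigned int job, unsigned int rank,
                              const char *stream, const char *text)
{
    assert(out != NULL && stream != NULL && text != NULL);
    return snprintf(out, out_size, "[%u,%u]<%s>%s", job, rank, stream, text);
}
```

Calling format_tagged_line(buf, sizeof buf, 1, 0, "stdout", "this is output") reproduces the example string from the commit message.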
Ralph Castain
4858c9b43c Revised platform files
This commit was SVN r19766.
2008-10-17 23:52:00 +00:00
Jeff Squyres
bccd8a6cfc Expand a few configure help string messages, mainly because Terry was
complaining that he didn't know what the defaults were ;-), but also
because it's the Right thing to do.

This commit was SVN r19765.
2008-10-17 23:05:30 +00:00
Jeff Squyres
e42139710b A typo prevented the valgrind memchecker component from finding the
Valgrind header files if they weren't already in the compiler's
default header file search path.  This commit fixes that typo and adds
a little more infrastructure (via an AC_SUBST) to pass in the relevant
CPPFLAGS to the build system for the valgrind memchecker component.

This commit was SVN r19764.
2008-10-17 23:04:39 +00:00
Jeff Squyres
307d52aedc Ensure that internal ptmalloc2 support is checked for properly.
This commit was SVN r19760.
2008-10-17 16:27:20 +00:00
Jeff Squyres
278ad5f867 Update svn:ignore
This commit was SVN r19759.
2008-10-17 16:26:52 +00:00
Jeff Squyres
e34c93c46a Fix problem of missing ) noted by Mostyn Lewis.
This commit was SVN r19758.
2008-10-17 16:03:17 +00:00
Rich Graham
9d59c0fbd6 add comment on what ompi_convertor_need_buffers() does.
This commit was SVN r19757.
2008-10-16 15:40:21 +00:00
Josh Hursey
88aa45dd52 Commit to bring online OpenIB, MX, and shared memory support for Open MPI's checkpoint/restart functionality. Some tuning is still needed, but basic functionality is in place.
There is still a problem with OpenIB and threads (external to C/R functionality). It has been reported in Ticket #1539

Additionally:
* Fix a file cleanup bug in CRS Base.
* Fix a possible deadlock in the TCP ft_event function
* Add a mca_base_param_deregister() function to MCA base
* Add whole process checkpoint timers
* Add support for BTL: OpenIB, MX,  Shared Memory
* Add support for Mpool: rdma, sm
* Sundry bounds checking and cleanup in some scattered functions

This commit was SVN r19756.
2008-10-16 15:09:00 +00:00
Ralph Castain
b46d3e766e Cleanup the plm failed-to-start problem a little - ensure that the event is always defined so we don't have to check when trying to trigger it, thus avoiding potential race conditions.
This commit was SVN r19755.
2008-10-16 14:58:32 +00:00
Ralph Castain
48c3de1865 Fix a problem in the plm "failed to start" code observed by Jeff. When we are unable to launch to a specific node because it doesn't exist or is down, the system would hang and/or segv. The reason for the hang was that we were "firing" the orted exit trigger prior to its timer event being defined - thus "locking" that one-shot and preventing it from firing when we actually were ready to use it.
The segv was caused by the fact that we don't really know which daemon failed to start (at least, in most cases), so we didn't set a pointer to the aborted proc object. All we really wanted, though, was to ensure that mpirun returned a non-zero exit status, so the fix was to simply return the default error status.

This commit was SVN r19754.
2008-10-16 14:21:37 +00:00
Ralph Castain
f0fe8ddb59 Required adjustment to LANL platform file
This commit was SVN r19753.
2008-10-16 14:17:41 +00:00
Lenny Verkhovsky
7ab9a72d0f merge with r19717, memory barrier on PPC
I ran IMB exchange on two QS22 machines with r19674 and it got stuck after 256 or 512 bytes every time.
After applying r19717 the test passed, so I guess this is an essential patch.

This commit was SVN r19752.

The following SVN revision numbers were found above:
  r19674 --> open-mpi/ompi@15c47a2473
  r19717 --> open-mpi/ompi@0a765cd788
2008-10-15 16:56:42 +00:00
Jeff Squyres
1e14bb305f Don't increase num_peers if the operation fails.
This commit was SVN r19748.
2008-10-15 10:37:20 +00:00
Shiqing Fan
3d4e89a5cd - Remove the unused code introduced with r19480, which was for serializing tcp events on Windows and was not successful.
This commit was SVN r19747.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r19480
2008-10-15 08:39:30 +00:00
Shiqing Fan
8b60c755c2 - Bring r19742 into trunk.
- Unify the way Windows and the other platforms handle callbacks. Thanks to George.
- This lets Windows use the same callbacks as Linux does, which also works.

This commit was SVN r19746.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r19742
2008-10-15 08:14:24 +00:00
Tim Mattox
f6a6c0a74b Bring the trunk NEWS file up to date with the 1.3 NEWS file.
This commit was SVN r19737.
2008-10-14 19:24:15 +00:00
Jeff Squyres
57a3dce9ba LANL noticed that calling MPI_ABORT invokes opal_output(0, ...)
unconditionally, which can result in a flood of messages to the user
if all MPI processes invoke abort.  Additionally, some users were
confused because they saw the MPI_ABORT opal_output() messages from
''some'' MPI processes, but not ''all'' of them (despite the fact that
every MPI process supposedly invoked MPI_ABORT).  The reason is that
calling MPI_ABORT triggers ORTE to kill all MPI processes, so it's a
race condition as to whether a) all MPI processes actually invoke
MPI_ABORT, and/or b) whether every process is able to opal_output()
before they are killed.

This commit does two simple things:
 * Now use orte_show_help() for the MPI_ABORT message, so they are
   aggregated. 
 * Add a note in the message that calling MPI_ABORT kills all
   processes, so you might not see all output, yadda yadda yadda.

This commit was SVN r19735.
2008-10-14 19:23:03 +00:00
Tim Mattox
4ff6bb924f Fix a typo in the NEWS file.
This commit was SVN r19734.
2008-10-14 19:19:56 +00:00
Tim Mattox
0e2f6ebd9e Resync the trunk NEWS file for the v1.2.8 release.
This commit was SVN r19731.
2008-10-14 16:52:46 +00:00
Rainer Keller
856eaf0d42 - Destruct the file->f_io_requests_lock as well.
This commit was SVN r19730.
2008-10-14 15:23:45 +00:00
Rolf vandeVaart
137729d2f9 Fix warnings (thanks Jeff) from previous fix. This is an extra
fix for ticket #1554.

This commit was SVN r19728.
2008-10-10 14:35:52 +00:00
Jeff Squyres
8b66171086 Only fail the CPC when the first QP is not PP *if* the CPC uses the
CTS protocol.

This commit was SVN r19726.
2008-10-09 17:31:36 +00:00
Tim Mattox
de623ea161 Remove a redundant if & goto.
This commit was SVN r19724.
2008-10-09 15:07:56 +00:00
Rolf vandeVaart
aad4427caa Fix the implementation of MPI_Reduce_scatter on intercommunicators.
We still do an interreduce but it is now followed by an intrascatterv.

This fixes trac:1554.

This commit was SVN r19723.

The following Trac tickets were found above:
  Ticket 1554 --> https://svn.open-mpi.org/trac/ompi/ticket/1554
2008-10-09 14:35:20 +00:00
Jeff Squyres
f7a94f17b9 Since we now & in the mask, the value can never be higher than the
mask value.  Also, the value is unsigned, so it can never be less than
0.

This commit was SVN r19719.
2008-10-09 13:12:49 +00:00
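The reasoning in f7a94f17b9 can be checked with a short sketch: once a value has been ANDed with a mask, it cannot exceed the mask, and an unsigned value cannot be negative, so range checks on either side are redundant. The function apply_mask and the 32-bit width are assumptions made for illustration; the commit message does not name the variable or its type beyond saying it is unsigned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: keep only the bits selected by mask.
 * The result has no bits set outside mask, so value <= mask always
 * holds, and value >= 0 holds trivially because the type is unsigned. */
static uint32_t apply_mask(uint32_t raw, uint32_t mask)
{
    uint32_t value = raw & mask;
    assert(value <= mask);   /* can never fire; shown to document the invariant */
    return value;
}
```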
Jeff Squyres
46d7ffd298 Remove some redundancy from redundant MCA redundant param names. The
following names are all new for v1.3, and therefore haven't been
officially released yet:

 * btl_openib_of_cq_size
 * btl_openib_of_max_inline_data
 * btl_openib_of_pkey
 * btl_openib_of_psn
 * btl_openib_of_mtu

The "_of_" (for OpenFabrics) in there is redundant.  It used to be
"_ib_", indicating that these values are pretty much passed directly
to the verbs stack.  But I think the "openib" in the name implies this
already; having "_of_" in there just seems redundant, makes the name
longer, and seems redundant.  It's also redundant.

So I took those "_of_"'s out of the MCA names.  The old (v1.2) names
are still valid (but deprecated), such as btl_openib_ib_cq_size.

This commit was SVN r19718.
2008-10-08 21:34:05 +00:00
Jeff Squyres
b8b7619312 * Remove pkey index as an MCA param
* Change name: mca_btl_openib_of_pkey_value -> mca_btl_openib_of_pkey
   (since now there's no index, the "_value" suffix is somewhat
   superfluous)
 * Put in a better help message for the _pkey MCA param (to agree with
   the new help message in v1.2.8)

This commit was SVN r19716.
2008-10-08 20:55:40 +00:00
Ralph Castain
a7afa869af Bring Jeff's changes over from v1.2 that restore the automatic sourcing of .profile for bash and ksh shells.
This commit was SVN r19709.
2008-10-08 14:21:42 +00:00