1
1
Граф коммитов

118 Коммитов

Автор SHA1 Сообщение Дата
Brian Barrett
8b778903d8 Fix longstanding issue with our multi-project support. Rather than using
pkg{data,lib,includedir}, use our own ompi{data,lib,includedir}, which is
always set to {datadir,libdir,includedir}/openmpi.  This will keep us from
having help files in prefix/share/open-rte when building without Open MPI,
but in prefix/share/openmpi when building with Open MPI.

This commit was SVN r30140.
2014-01-07 22:11:15 +00:00
Ralph Castain
a200e4f865 As per the RFC, bring in the ORTE async progress code and the rewrite of OOB:
*** THIS RFC INCLUDES A MINOR CHANGE TO THE MPI-RTE INTERFACE ***

Note: during the course of this work, it was necessary to completely separate the MPI and RTE progress engines. There were multiple places in the MPI layer where ORTE_WAIT_FOR_COMPLETION was being used. A new OMPI_WAIT_FOR_COMPLETION macro was created (defined in ompi/mca/rte/rte.h) that simply cycles across opal_progress until the provided flag becomes false. Places where the MPI layer blocked waiting for RTE to complete an event have been modified to use this macro.

***************************************************************************************

I am reissuing this RFC because of the time that has passed since its original release. Since its initial release and review, I have debugged it further to ensure it fully supports tests like loop_spawn. It therefore seems ready for merge back to the trunk. Given its prior review, I have set the timeout for one week.

The code is in  https://bitbucket.org/rhc/ompi-oob2


WHAT:    Rewrite of ORTE OOB

WHY:       Support asynchronous progress and a host of other features

WHEN:    Wed, August 21

SYNOPSIS:
The current OOB has served us well, but a number of limitations have been identified over the years. Specifically:

* it is only progressed when called via opal_progress, which can lead to hangs or recursive calls into libevent (which is not supported by that code)

* we've had issues when multiple NICs are available as the code doesn't "shift" messages between transports - thus, all nodes had to be available via the same TCP interface.

* the OOB "unloads" incoming opal_buffer_t objects during the transmission, thus preventing use of OBJ_RETAIN in the code when repeatedly sending the same message to multiple recipients

* there is no failover mechanism across NICs - if the selected NIC (or its attached switch) fails, we are forced to abort

* only one transport (i.e., component) can be "active"


The revised OOB resolves these problems:

* async progress is used for all application processes, with the progress thread blocking in the event library

* each available TCP NIC is supported by its own TCP module. The ability to asynchronously progress each module independently is provided, but not enabled by default (a runtime MCA parameter turns it "on")

* multi-address TCP NICs (e.g., a NIC with both an IPv4 and IPv6 address, or with virtual interfaces) are supported - reachability is determined by comparing the contact info for a peer against all addresses within the range covered by the address/mask pairs for the NIC.

* a message that arrives on one TCP NIC is automatically shifted to whatever NIC that is connected to the next "hop" if that peer cannot be reached by the incoming NIC. If no TCP module will reach the peer, then the OOB attempts to send the message via all other available components - if none can reach the peer, then an "error" is reported back to the RML, which then calls the errmgr for instructions.

* opal_buffer_t now conforms to standard object rules re OBJ_RETAIN as we no longer "unload" the incoming object

* NIC failure is reported to the TCP component, which then tries to resend the message across any other available TCP NIC. If that doesn't work, then the message is given back to the OOB base to try using other components. If all that fails, then the error is reported to the RML, which reports to the errmgr for instructions

* obviously from the above, multiple OOB components (e.g., TCP and UD) can be active in parallel

* the matching code has been moved to the RML (and out of the OOB/TCP component) so it is independent of transport

* routing is done by the individual OOB modules (as opposed to the RML). Thus, both routed and non-routed transports can simultaneously be active

* all blocking send/recv APIs have been removed. Everything operates asynchronously.


KNOWN LIMITATIONS:

* although provision is made for component failover as described above, the code for doing so has not been fully implemented yet. At the moment, if all connections for a given peer fail, the errmgr is notified of a "lost connection", which by default results in termination of the job if it was a lifeline

* the IPv6 code is present and compiles, but is not complete. Since the current IPv6 support in the OOB doesn't work anyway, I don't consider this a blocker

* routing is performed at the individual module level, yet the active routed component is selected on a global basis. We probably should update that to reflect that different transports may need/choose to route in different ways

* obviously, not every error path has been tested nor necessarily covered

* determining abnormal termination is more challenging than in the old code as we now potentially have multiple ways of connecting to a process. Ideally, we would declare "connection failed" when *all* transports can no longer reach the process, but that requires some additional (possibly complex) code. For now, the code replicates the old behavior only somewhat modified - i.e., if a module sees its connection fail, it checks to see if it is a lifeline. If so, it notifies the errmgr that the lifeline is lost - otherwise, it notifies the errmgr that a non-lifeline connection was lost.

* reachability is determined solely on the basis of a shared subnet address/mask - more sophisticated algorithms (e.g., the one used in the tcp btl) are required to handle routing via gateways

* the RML needs to assign sequence numbers to each message on a per-peer basis. The receiving RML will then deliver messages in order, thus preventing out-of-order messaging in the case where messages travel across different transports or a message needs to be redirected/resent due to failure of a NIC

This commit was SVN r29058.
2013-08-22 16:37:40 +00:00
Ralph Castain
8d2fa3693b First cut at removing the native Windows support. Remove all the Windows-specific components, and the .windows files sprinkled around. Remove the Windows platform files and MTT scripts. Update the NEWS to point Windows users to the cygwin package.
This commit was SVN r28116.
2013-02-26 20:44:56 +00:00
Brian Barrett
312f37706e In talking about this with Jeff and Ralph, we don't actually need
ompi_show_help, because opal_show_help is replaced with an 
aggregating version when using ORTE, so there's no reason to
directly call orte_show_help.

This commit was SVN r28051.
2013-02-12 21:10:11 +00:00
Brian Barrett
f42783ae1a Move the RTE framework change into the trunk. With this change, all non-CR
runtime code goes through one of the rte, dpm, or pubsub frameworks.

This commit was SVN r27934.
2013-01-27 23:25:10 +00:00
Samuel Gutierrez
4c28c8cbd0 New sm BTL initialization take two. This approach is pretty simple. Instead of
using the modex or RML to share sm initialization information, have node rank 0
create a file containing initialization information in a well-known place. Then
during add_procs, the rest of the node processes requiring sm BTL initialization
will just read from that file to complete their initialization.

This commit was SVN r27789.
2013-01-11 16:24:56 +00:00
Samuel Gutierrez
c4acd20eb9 Backout r27739.
This commit was SVN r27745.

The following SVN revision numbers were found above:
  r27739 --> open-mpi/ompi@a159bfaf25
2013-01-05 01:54:23 +00:00
Samuel Gutierrez
a159bfaf25 sm BTL initialization via modex, as discussed at last year's meeting.
This commit was SVN r27739.
2013-01-03 21:52:20 +00:00
Samuel Gutierrez
6188d97e1a Getting out of bed this morning was a bad idea... Reverting the sm update once more because it breaks direct launch. Will address this issue and commit the update once it has all been tested. Sorry everyone!
This commit was SVN r27001.
2012-08-10 22:20:38 +00:00
Samuel Gutierrez
159bd2e62e Let's try this again: sm BTL initialization via modex.
This commit was SVN r26989.
2012-08-10 20:12:36 +00:00
Samuel Gutierrez
6a70063812 Yikes - that's not right! Back out 26987. I'll try again in a bit... Sorry!
This commit was SVN r26988.
2012-08-10 19:57:51 +00:00
Samuel Gutierrez
2c80273246 sm BTL initialization via modex.
This commit was SVN r26987.
2012-08-10 19:51:41 +00:00
Shiqing Fan
204fbfe4b1 update the wv btl component.
This commit was SVN r26872.
2012-07-26 15:35:01 +00:00
Samuel Gutierrez
76d94bf9bf Plug leak. Thanks, Nathan.
This commit was SVN r26846.
2012-07-23 21:11:21 +00:00
Samuel Gutierrez
8096852a16 Towards RML-less shared-memory initialization (primarily for eventual BTL
move).  Extended common sm API with: mca_common_sm_module_create and
mca_common_sm_module_attach. Please note that the new routines aren't currently
used -- but will be...

This commit was SVN r26845.
2012-07-23 19:38:13 +00:00
Josh Hursey
28681deffa Backout the ORCA commit. :(
There is a linking issue on Mac OSX that needs to be addressed before this is able to come back into the trunk.

This commit was SVN r26676.
2012-06-27 01:28:28 +00:00
Josh Hursey
542330e3a7 Commit of ORCA: Open MPI Runtime Collaborative Abstraction
This is a runtime interposition project that sits between the OMPI and ORTE layers in Open MPI.

The project is described on the wiki:
  https://svn.open-mpi.org/trac/ompi/wiki/Runtime_Interposition

And on this email thread:
  http://www.open-mpi.org/community/lists/devel/2012/06/11109.php

This commit was SVN r26670.
2012-06-26 21:42:16 +00:00
Samuel Gutierrez
63869c431b init seg_num_procs_inited to zero before the atomic add.
This commit was SVN r25710.
2012-01-11 03:37:23 +00:00
Samuel Gutierrez
d1a44ecd34 send packed buffers instead of using iovecs in common sm rml. this commit will
hopefully resolve the periodic bus errors that some mtt tests have been
encountering.

This commit was SVN r25692.
2012-01-05 00:11:59 +00:00
Samuel Gutierrez
519f71ab7e silences valgrind warning in common sm (Syscall param writev(vector[...]) points
to uninitialised byte(s)).  probably also silences a large stack allocation
warning in coverity.

This commit was SVN r25666.
2011-12-16 23:17:48 +00:00
Samuel Gutierrez
0ca6603fa0 remove some unused cruft in shmem. minor common sm cleanup.
This commit was SVN r25665.
2011-12-16 22:43:55 +00:00
Samuel Gutierrez
375162c693 this commit fixes a few things. 1. silence warning in common sm. 2. remove unneeded config code in common sm. 3. move opal_shmem_base_close to a better place in opal_finalize. 4. fix opal_path_nfs output.
This commit was SVN r25518.
2011-11-28 23:41:19 +00:00
Samuel Gutierrez
b4edf0ff5c getting ready for 1.5 port of the shared memory enhancements. remove some unused/unneeded stuff and minor style update.
This commit was SVN r25513.
2011-11-28 16:08:32 +00:00
George Bosilca
efd88e10d7 Cleanup the error codes. Get rid of all the useless ones, and
mark the distinction between ORTE and OMPI errors.

This commit was SVN r25323.
2011-10-19 03:51:53 +00:00
Samuel Gutierrez
81f38b258a commit of new shared memory backing facility framework (shmem) and its components.
This commit was SVN r24795.
2011-06-21 15:41:57 +00:00
Samuel Gutierrez
0867454a06 Fixes CID #1665.
This commit was SVN r24519.
2011-03-12 03:41:49 +00:00
Samuel Gutierrez
5cff21842a a friday night in sf, nm. fixes CID 1666.
This commit was SVN r24517.
2011-03-12 02:39:31 +00:00
Jeff Squyres
ec90a3ba6d Fix a few memory leaks, and ensure that coll sm is also registering
the common SM MCA params.

This commit was SVN r24497.
2011-03-08 17:36:59 +00:00
Shiqing Fan
f43862420c Convert the bad dos line endings to unix style for all windows related files.
This commit was SVN r24137.
2010-12-02 12:08:08 +00:00
Samuel Gutierrez
c25945ce48 remove one more extra semicolon
This commit was SVN r23954.
2010-10-26 17:30:34 +00:00
Samuel Gutierrez
e1589a2a28 remove an extra semi-colon
This commit was SVN r23953.
2010-10-26 17:23:30 +00:00
Jeff Squyres
73bcc4a36b Fix mistake that came in via the ompi-agen tree in r23764. The mistake wasn't part of the core autogen upgrade; it was an additional 'bonus' cleanup. Oops. The mistake will always create a set of directories under installdir, even if you do not --with-devel-headers. The set of directories will be empty, but still -- they should not be there at all. This commit fixes that -- the directories are not created at all if you do not --with-devel-headers
This commit was SVN r23801.

The following SVN revision numbers were found above:
  r23764 --> open-mpi/ompi@40a2bfa238
2010-09-24 22:53:28 +00:00
Samuel Gutierrez
90a132b0a2 disable system v shared memory support when checkpoint/restart is enabled. this combo could presumably work properly someday.
This commit was SVN r23792.
2010-09-22 22:05:07 +00:00
Samuel Gutierrez
1c8f3e1add fix common sm segf when used with cr - thanks to Ananda for finding this issue.
This commit was SVN r23781.
2010-09-20 22:20:43 +00:00
Ralph Castain
40a2bfa238 WARNING: Work on the temp branch being merged here encountered problems with bugs in subversion. Considerable effort has gone into validating the branch. However, not all conditions can be checked, so users are cautioned that it may be advisable to not update from the trunk for a few days to allow MTT to identify platform-specific issues.
This merges the branch containing the revamped build system based around converting autogen from a bash script to a Perl program. Jeff has provided emails explaining the features contained in the change.

Please note that configure requirements on components HAVE CHANGED. For example. a configure.params file is no longer required in each component directory. See Jeff's emails for an explanation.

This commit was SVN r23764.
2010-09-17 23:04:06 +00:00
Samuel Gutierrez
3b572e14ce Fix build issues on Windows. Thanks to Shiqing for pointing this out.
This commit was SVN r23646.
2010-08-24 14:01:05 +00:00
Shiqing Fan
a987eafc90 Add another sm definition for ignoring posix sm on Windows, and exclude those source files.
This commit was SVN r23640.
2010-08-24 09:28:56 +00:00
Samuel Gutierrez
3b162593e6 New POSIX shared memory component and other common sm enhancements.
NOTE: mmap is still the default.

Some highlights:
o Silent component failover.
o The sysv component will only be queried for selection if it is placed before
  the mmap component (for example, -mca mpi_common_sm sysv,posix,mmap).  In the
  default case, sysv will never be queried/selected.
o Per some on-list discussion, now unlinking mmaped file in both mmap and posix
  components (see: "System V Shared Memory for Open MPI: Request for Community
  Input and Testing" thread).
o  Assuming local process homogeneity with respect to all utilized shared
   memory facilities. That is, if one local process deems a particular shared
   memory facility acceptable, then ALL local processes should be able to
   utilize that facility. As it stands, this is an important point because one
   process dictates to all other local processes which common sm component will
   be selected based on its own, local run-time test.
o Addressed some of George's code reuse concerns.

This commit was SVN r23633.
2010-08-23 16:04:13 +00:00
Shiqing Fan
d391c57b0f A more proper fix for the HANDLE definition.
This commit was SVN r23269.
2010-06-14 14:17:07 +00:00
Samuel Gutierrez
2fb7c344fc Added a new System V (sysv) shared memory component for Open MPI.
Configure Option:
--enable-sysv

MCA Parameter:
mpi_common_sm

mpi_common_sm accepts a comma delimited list of: [sysv],mmap (order
dependent).  The first component that is successfully selected is used. For
example, -mca mpi_common_sm sysv,mmap will first try sysv. If sysv is not
successfully selected, then mmap will be used.  mmap will be used if 
mpi_common_sm is not provided.

Notes:
Please make certain that your system's shmmax limit, or equivalent, is larger
than mpool_sm_min_size.  Otherwise, shmget may fail.

This commit was SVN r23260.
2010-06-09 16:58:52 +00:00
Abhishek Kulkarni
afbe3e99c6 * Wrap all the direct error-code checks of the form (OMPI_ERR_* == ret) with
(OMPI_ERR_* = OPAL_SOS_GET_ERR_CODE(ret)), since the return value could be a
 SOS-encoded error. The OPAL_SOS_GET_ERR_CODE() takes in a SOS error and returns
 back the native error code.

* Since OPAL_SUCCESS is preserved by SOS, also change all calls of the form
  (OPAL_ERROR == ret) to (OPAL_SUCCESS != ret). We thus avoid having to
  decode 'ret' to get the native error code.

This commit was SVN r23162.
2010-05-17 23:08:56 +00:00
Samuel Gutierrez
7654b39349 Fix segfault in two error paths.
This commit was SVN r22978.
2010-04-15 15:51:57 +00:00
Josh Hursey
3db01f0795 Add the process name to the error message resulting from a failed mmap(), open(), or ftruncate() so that it is slightly easier to figure out which process in the system caused the problem with sm.
This commit was SVN r22803.
2010-03-10 00:18:04 +00:00
Samuel Gutierrez
dcb5a2331f Fixed some typos in comments.
This commit was SVN r22801.
2010-03-09 20:41:25 +00:00
Eugene Loh
316892b49f Fix spelling of "degradation".
This commit was SVN r22714.
2010-02-25 19:41:59 +00:00
Jeff Squyres
583394e30b This help message got a little jumbled.
This commit was SVN r22689.
2010-02-23 21:09:16 +00:00
Rainer Keller
548d6f7c61 - Incorporated a rewording proposal by Jeff.
This commit was SVN r22670.
2010-02-19 14:37:09 +00:00
Rainer Keller
ea4de16561 - Check whether file is opened on network file-system.
If file does not exist, check the directory it lives in...
   Maybe used by caller, trying to open mmap() on NFS, Lustre or
   Panasas (thanks Sam).
   For now, this is used to warn about the usage of mmap on such FS.

   Please note, that Ralph mentioned the orte_no_session_dir parameter.
   The help message includes a reference to this.

   Tested on NFS and Lustre on Linux on
     smoky: mpirun --mca orte_tmpdir_base $HOME/tmp -np 2 ./mpi_stub
     jaguar: mpirun ... --mca orte_tmpdir_base /tmp/work/$USER ...

   Fixes trac:1354

   This should   cmr:v1.5   once it has soaked and is shown to work on
   Solaris

This commit was SVN r22604.

The following Trac tickets were found above:
  Ticket 1354 --> https://svn.open-mpi.org/trac/ompi/ticket/1354
2010-02-10 23:18:29 +00:00
Jeff Squyres
a6c1fe888f We also need .so versioning of the OMPI "common" components since they
are installed as standalone libraries in $libdir.

This commit was SVN r22148.
2009-10-27 20:58:34 +00:00
Jeff Squyres
0d1e177453 Remove 2 extraneous ORTE_ERROR_LOGs and 1 extraneous opal_output.
This commit was SVN r22071.
2009-10-07 20:12:37 +00:00