1
1

68 Коммитов

Автор SHA1 Сообщение Дата
Ralph Castain
9613b3176c Effectively revert the orte_output system and return to direct use of opal_output at all levels. Retain the orte_show_help subsystem to allow aggregation of show_help messages at the HNP.
After much work by Jeff and myself, and quite a lot of discussion, it has become clear that we simply cannot resolve the infinite loops caused by RML-involved subsystems calling orte_output. The original rationale for the change to orte_output has also been reduced by shifting the output of XML-formatted vs human readable messages to an alternative approach.

I have globally replaced the orte_output/ORTE_OUTPUT calls in the code base, as well as the corresponding .h file name. I have test compiled and run this on the various environments within my reach, so hopefully this will prove minimally disruptive.

This commit was SVN r18619.
2008-06-09 14:53:58 +00:00
George Bosilca
e361bcb64c Send optimizations.
1. The send path get shorter. The BTL is allowed to return > 0 to specify that the
   descriptor was pushed to the networks, and that the memory attached to it is 
   available again for the upper layer. The MCA_BTL_DES_SEND_ALWAYS_CALLBACK flag
   can be used by the PML to force the BTL to always trigger the callback.
   Unmodified BTL will continue to work as expected, as they will return OMPI_SUCCESS
   which force the PML to have exactly the same behavior as before. Some BTLs have
   been modified: self, sm, tcp, mx.
2. Add send immediate interface to BTL.
   The idea is to have a mechanism of allowing the BTL to take advantage of
   send optimizations such as the ability to deliver data "inline". Some
   network APIs such as Portals allow data to be sent using a "thin" event
   without packing data into a memory descriptor. This interface change
   allows the BTL to use such capabilities and allows for other optimizations
   in the future. All existing BTLs except for Portals and sm have this interface
   set to NULL.

This commit was SVN r18551.
2008-05-30 03:58:39 +00:00
Jeff Squyres
e7ecd56bd2 This commit represents a bunch of work on a Mercurial side branch. As
such, the commit message back to the master SVN repository is fairly
long.

= ORTE Job-Level Output Messages =

Add two new interfaces that should be used for all new code throughout
the ORTE and OMPI layers (we already make the search-and-replace on
the existing ORTE / OMPI layers):

 * orte_output(): (and corresponding friends ORTE_OUTPUT,
   orte_output_verbose, etc.)  This function sends the output directly
   to the HNP for processing as part of a job-specific output
   channel.  It supports all the same outputs as opal_output()
   (syslog, file, stdout, stderr), but for stdout/stderr, the output
   is sent to the HNP for processing and output.  More on this below.
 * orte_show_help(): This function is a drop-in-replacement for
   opal_show_help(), with two differences in functionality:
   1. the rendered text help message output is sent to the HNP for
      display (rather than outputting directly into the process' stderr
      stream)
   1. the HNP detects duplicate help messages and does not display them
      (so that you don't see the same error message N times, once from
      each of your N MPI processes); instead, it counts "new" instances
      of the help message and displays a message every ~5 seconds when
      there are new ones ("I got X new copies of the help message...")

opal_show_help and opal_output still exist, but they only output in
the current process.  The intent for the new orte_* functions is that
they can apply job-level intelligence to the output.  As such, we
recommend that all new ORTE and OMPI code use the new orte_*
functions, not thei opal_* functions.

=== New code ===

For ORTE and OMPI programmers, here's what you need to do differently
in new code:

 * Do not include opal/util/show_help.h or opal/util/output.h.
   Instead, include orte/util/output.h (this one header file has
   declarations for both the orte_output() series of functions and
   orte_show_help()).
 * Effectively s/opal_output/orte_output/gi throughout your code.
   Note that orte_output_open() takes a slightly different argument
   list (as a way to pass data to the filtering stream -- see below),
   so you if explicitly call opal_output_open(), you'll need to
   slightly adapt to the new signature of orte_output_open().
 * Literally s/opal_show_help/orte_show_help/.  The function signature
   is identical.

=== Notes ===

 * orte_output'ing to stream 0 will do similar to what
   opal_output'ing did, so leaving a hard-coded "0" as the first
   argument is safe.
 * For systems that do not use ORTE's RML or the HNP, the effect of
   orte_output_* and orte_show_help will be identical to their opal
   counterparts (the additional information passed to
   orte_output_open() will be lost!).  Indeed, the orte_* functions
   simply become trivial wrappers to their opal_* counterparts.  Note
   that we have not tested this; the code is simple but it is quite
   possible that we mucked something up.

= Filter Framework =

Messages sent view the new orte_* functions described above and
messages output via the IOF on the HNP will now optionally be passed
through a new "filter" framework before being output to
stdout/stderr.  The "filter" OPAL MCA framework is intended to allow
preprocessing to messages before they are sent to their final
destinations.  The first component that was written in the filter
framework was to create an XML stream, segregating all the messages
into different XML tags, etc.  This will allow 3rd party tools to read
the stdout/stderr from the HNP and be able to know exactly what each
text message is (e.g., a help message, another OMPI infrastructure
message, stdout from the user process, stderr from the user process,
etc.).

Filtering is not active by default.  Filter components must be
specifically requested, such as:

{{{
$ mpirun --mca filter xml ...
}}}

There can only be one filter component active.

= New MCA Parameters =

The new functionality described above introduces two new MCA
parameters:

 * '''orte_base_help_aggregate''': Defaults to 1 (true), meaning that
   help messages will be aggregated, as described above.  If set to 0,
   all help messages will be displayed, even if they are duplicates
   (i.e., the original behavior).
 * '''orte_base_show_output_recursions''': An MCA parameter to help
   debug one of the known issues, described below.  It is likely that
   this MCA parameter will disappear before v1.3 final.

= Known Issues =

 * The XML filter component is not complete.  The current output from
   this component is preliminary and not real XML.  A bit more work
   needs to be done to configure.m4 search for an appropriate XML
   library/link it in/use it at run time.
 * There are possible recursion loops in the orte_output() and
   orte_show_help() functions -- e.g., if RML send calls orte_output()
   or orte_show_help().  We have some ideas how to fix these, but
   figured that it was ok to commit before feature freeze with known
   issues.  The code currently contains sub-optimal workarounds so
   that this will not be a problem, but it would be good to actually
   solve the problem rather than have hackish workarounds before v1.3 final.

This commit was SVN r18434.
2008-05-13 20:00:55 +00:00
George Bosilca
1bd31aa3ac Cleanup the OMPI_DECLSPEC/OMPI_MODULE_DECLSPEC in the PMLs.
This commit was SVN r17093.
2008-01-09 20:32:39 +00:00
Jeff Squyres
213b5d5c6e Per long threads on the mailing list and much confusion discussion
about linkers, have all OPAL, ORTE, and OMPI components '''not'' link
against the OPAL, ORTE, or OMPI libraries.

See ttp://www.open-mpi.org/community/lists/users/2007/10/4220.php for
details (or https://svn.open-mpi.org/trac/ompi/wiki/Linkers for a
better-formatted version of the same info).

This commit was SVN r16968.
2007-12-15 13:32:02 +00:00
Rich Graham
27a748e7eb change all instances of ompi_free_list_init to ompi_free_list_init_new. Header
and payload data are specified separately at this stage.

This commit was SVN r16633.
2007-11-01 23:38:50 +00:00
Rich Graham
67f4b69848 propogate fix for out of buffered send memory space to dr and ob1 - thanks
George.

This commit was SVN r16593.
2007-10-27 00:17:53 +00:00
Rich Graham
9c0483088a if unable to get buffered space, try and progress communications to
free up resources.

This commit was SVN r16591.
2007-10-26 23:16:31 +00:00
Gleb Natapov
52c6160252 MCA_PML_BASE_REQUEST_MPI_COMPLETE() macro does nothing except call to
ompi_request_complete(). Remove the macro and call the function directly.

This commit was SVN r16498.
2007-10-18 14:20:24 +00:00
Gleb Natapov
e0a3a7e53e Move duplicated code all over the code to a single function ompi_request_wait_completion().
This commit was SVN r16494.
2007-10-18 12:33:21 +00:00
George Bosilca
e3105a85be Don't require a progress function from the PML. If there is one then the
PML base will take care of the registration with the event library.
Otherwise, (and this apply for the CM case) the MTL are in charge of
registering their own progress function.

This commit was SVN r16415.
2007-10-09 23:28:53 +00:00
George Bosilca
e41ee17ca5 Add a small comment that hopefully will enforce the correct ordering of
the fields between CM and the other PML in the requests structure.

This commit was SVN r15760.
2007-08-03 23:59:29 +00:00
Rich Graham
f2a30cde5d add table of send completion callback functions, on a per send-type
basis.

This commit was SVN r15471.
2007-07-17 21:26:56 +00:00
Rich Graham
0991c3d5f5 move buffered send component clean up out of the pml to ompi_mpi_finalize.
This commit was SVN r15463.
2007-07-17 14:50:52 +00:00
Rich Graham
1a4ce2a961 move setting of the component used to managed buffer sends out of the
pmls, and into ompi_mpi_init.  This is the first of several steps to pull
buffered send management out of the pmls.

This commit was SVN r15451.
2007-07-16 21:52:25 +00:00
George Bosilca
1e825888a5 Fix the problem reported on #1087. The global send and receive requests queues are
now release in the base close, so there is no need for the cm PML to destroy them.

This commit was SVN r15425.
2007-07-13 23:56:09 +00:00
George Bosilca
9ed3ede73e Correct the thin and heavy requests management for the CM PML.
This commit was SVN r15361.
2007-07-11 15:10:01 +00:00
George Bosilca
ef7d17d814 Fix a copy&paste typo.
This commit was SVN r15360.
2007-07-11 15:09:06 +00:00
George Bosilca
9b501eb66d Looks like MAX is not a standard macro. Anyway, that the heavy requests is larger than the thin
seems to be a "correct" assumption.

This commit was SVN r15348.
2007-07-11 00:04:33 +00:00
George Bosilca
e19777e910 A more consistent version. As we now share the send and receive queue, we
have to construct/destruct only once. Therefore, the construction will
happens before digging for a PML, while the destruction just before
finalizing the component.

Add some OPAL_LIKELY/OPAL_UNLIKELY.

This commit was SVN r15347.
2007-07-10 23:45:23 +00:00
George Bosilca
433f8a7694 This patch bring full support for message queues in Open MPI. Now the send and
receive queues are shared among all PMLs, they are declared in the base PML,
and the selected PML is in charge of initializing and releasing them. 

The CM PML is slightly different compared with OB1 or DR. Internally it use
2 different types of requests: light and heavy. However, now with this patch
both types of requests are stored in the same queue, and cast appropriately
on the allocation macro. This means we might use less memory than we allocate,
but in exchange we got full support for most of the parallel debuggers.

Another thing with this patch, is that now for all PML (CM included) the basic
PML requests start with the same fields, and they are declared in the same order
in the request structure. Moreover, the fields have been moved in such a way
that only one volatile/atomic will exist per line of cache (hopefully).

This commit was SVN r15346.
2007-07-10 22:16:38 +00:00
Sven Stork
a04c8eb39a - Bring over the visibility feature, for a finer symbol export control
via the visibility feature that is provided by some compilers.

  Per default this feature is disabled, to enable it you need to
  configure with --enable-visibility and obviously you need a compiler
  with visibility support. Please refer to the wiki for more information.
  https://svn.open-mpi.org/trac/ompi/wiki/Visibility

This commit was SVN r14582.
2007-05-04 09:03:37 +00:00
Jeff Squyres
51f286d737 Just like r14289 on the ORTE trunk:
Per discussions with Brian and Ralph, make a slight correction in
where components are installed. Use $pkglibdir, not $libdir/openmpi,
so that when compiled in the orte trunk, components are installed to
the right directory (because the component search patch is checking
$pkglibdir).

This commit was SVN r14345.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r14289
2007-04-12 11:19:42 +00:00
Josh Hursey
dadca7da88 Merging in the jjhursey-ft-cr-stable branch (r13912 : HEAD).
This merge adds Checkpoint/Restart support to Open MPI. The initial
frameworks and components support a LAM/MPI-like implementation.

This commit follows the risk assessment presented to the Open MPI core
development group on Feb. 22, 2007.

This commit closes trac:158

More details to follow.

This commit was SVN r14051.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r13912

The following Trac tickets were found above:
  Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158
2007-03-16 23:11:45 +00:00
Sven Stork
870740efe2 - proper export symbols that are required by other components.
This commit was SVN r13841.
2007-02-28 12:51:55 +00:00
Brian Barrett
041beeb1b6 Share currently selected PML in the modex information, then check whenever
adding new procs that the remote proc's pml is the same as our local pml.
Turns the hangs from mismatched PMLs into an abort, which is better,
I think.

This commit was SVN r13582.
2007-02-09 16:38:16 +00:00
Galen Shipman
f98a442c82 Fix a problem in the selection logic for MX. Basically we need to be able to
open MTL MX and BTL MX and initialize them at the same time. The problem is
that both call mx_init and mx_finalize, solution is to add an external entity
that does the init and finalize (based on ref counting).

This commit was SVN r13576.
2007-02-09 03:19:38 +00:00
Galen Shipman
ec610a9e65 spread priorities out a bit..
This commit was SVN r13487.
2007-02-04 00:55:25 +00:00
Galen Shipman
a94101fa62 mostly another hack around for PML selection, allows CM be select itself if an
MTL is available, if not OB1 is used. Still prevents DR and OB1 from stomping
on each other though. 

This commit was SVN r13481.
2007-02-03 02:01:18 +00:00
Brian Barrett
a0b40ce45a Fix race condition in setting MPI_ERROR -- with buffered sends, the
request can complete before the operation, meaning that a bogus MPI_ERROR
is read

This commit was SVN r13401.
2007-01-31 21:40:14 +00:00
Brian Barrett
a34e67d743 Remove unneeded PARAM_INIT_FILE variable in configure.params files used by
components that use configure.m4 for configuration or are always built. 
The macro has not been needed since moving to configure types other than
configure.stub

Fixes trac:590

This commit was SVN r13031.

The following Trac tickets were found above:
  Ticket 590 --> https://svn.open-mpi.org/trac/ompi/ticket/590
2007-01-08 03:44:22 +00:00
Brian Barrett
99c0a29602 Disable CM and DR PMLs in heterogeneous situtations as neither are
heterogeneous safe.

Refs trac:587

This commit was SVN r12942.

The following Trac tickets were found above:
  Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
2006-12-30 16:17:56 +00:00
Brian Barrett
01e8fc5f91 Redo of r12871, without the preconnect code change:
Move the req_mtl structure back to the end of each of the structures in 
the CM PML. The req_mtl structure is cast into a mtl_*_request_structure 
for each MTL, which is larger than the req_mtl itself. The cast will cause
the *_request to overwrite parts of the heavy requests if the req_mtl
isn't the *LAST* thing on each structure (hence the comment). This was 
moved as an optimization at some point, which caused buffer sends to fail...

Refs trac:669

This commit was SVN r12873.

The following SVN revision numbers were found above:
  r12871 --> open-mpi/ompi@597598b712

The following Trac tickets were found above:
  Ticket 669 --> https://svn.open-mpi.org/trac/ompi/ticket/669
2006-12-15 17:54:14 +00:00
Brian Barrett
bdf0b231b2 Undo r12871, as it contained some code in ompi/runtime that shouldn't have been
committed

Refs trac:669

This commit was SVN r12872.

The following SVN revision numbers were found above:
  r12871 --> open-mpi/ompi@597598b712

The following Trac tickets were found above:
  Ticket 669 --> https://svn.open-mpi.org/trac/ompi/ticket/669
2006-12-15 17:52:13 +00:00
Brian Barrett
597598b712 Move the req_mtl structure back to the end of each of the structures in the
CM PML.  The req_mtl structure is cast into a mtl_*_request_structure for
each MTL, which is larger than the req_mtl itself.  The cast will cause
the *_request to overwrite parts of the heavy requests if the req_mtl
isn't the *LAST* thing on each structure (hence the comment).  This was
moved as an optimization at some point, which caused buffer sends to
fail...

Refs trac:669

This commit was SVN r12871.

The following Trac tickets were found above:
  Ticket 669 --> https://svn.open-mpi.org/trac/ompi/ticket/669
2006-12-15 17:46:53 +00:00
Brian Barrett
6f8b366acb Rename liborte to libopen-rte and libopal to libopen-pal per telecon today
and bug #632.

Refs trac:632

This commit was SVN r12762.

The following Trac tickets were found above:
  Ticket 632 --> https://svn.open-mpi.org/trac/ompi/ticket/632
2006-12-05 18:27:24 +00:00
George Bosilca
a38cd366d7 Construct the convertor. It's not really required, but it's not in the
critical path anyway. At least in debug mode we get nice informations about
where the convertor was created.

This commit was SVN r12549.
2006-11-10 20:55:06 +00:00
George Bosilca
858ab24e8e The req_mtl field has to be the last in the struct or bad things happen.
This commit was SVN r12548.
2006-11-10 20:53:41 +00:00
George Bosilca
eb45a5e402 Move things around a little bit. Mainly fields from the send and receive
request in the base request. Rearrange the fields to keep the data
together. Remove some useless tests.

This commit was SVN r12482.
2006-11-08 04:58:23 +00:00
Jeff Squyres
020efdf1f9 Refs trac:250
This commit essentially caches the invoking comm/win/file on the
ompi_request_t. This, paired with the req_type field, allows us to
retrieve the invoking MPI object and invoke the proper errhandler.

The patch is missing most updates for the MPI-2 one-sided stuff (i.e.,
the patch mainly fixes comms and files); I didn't really understand
that code and didn't want to hazard trying to figure it out when Brian
can probably do it much more quickly.

So #250 will still stay open, pending MPI-2 one-sided updates for this
stuff.

This commit was SVN r12339.

The following Trac tickets were found above:
  Ticket 250 --> https://svn.open-mpi.org/trac/ompi/ticket/250
2006-10-27 12:35:27 +00:00
Jeff Squyres
e02114dcf3 Fixes trac:529.
* Create a new request type: NOOP (described below)
 * For all MPI_*_INIT functions, OBJ_NEW an ompi_request_t and set its
   type to NOOP
 * Ensure that the NOOP requests are OBJ_RELEASE'd when they are done
 * MPI_START looks at the request type; if NOOP, just return success. If
   not, call the PML start() function
 * MPI_STARTALL always pass the entire array of requests back to the PML
   (see next point)
 * Make the PMLs only process PML requests (i.e., ignore/skip anything
   that isn't of type PML -- such as the NOOP requests)
 * Add a little more param error checking in STARTALL

This commit was SVN r12338.

The following Trac tickets were found above:
  Ticket 529 --> https://svn.open-mpi.org/trac/ompi/ticket/529
2006-10-27 12:32:36 +00:00
George Bosilca
126a68dc9a Big datatype commit. Remove all unused features of the datatype engine. As the memory
allocation logic is completely done outside the data-type engine (in the PML) there is
no need for any special case inside the data-type engine. There is less arguments for
the ompi_convertor_pack and ompi_convertor_unpack as well (the last field free_after is
not required anymore as there is no memory allocated in the engine itself). This change
affect all components using datatypes. I test most of them, but it might happens that I
miss some ... If it's the case please let me know (don't shoot the pianist!!).

This commit was SVN r12331.
2006-10-26 23:11:26 +00:00
Galen Shipman
2036bf5c3c make smart and dumb compilers happy
This commit was SVN r12178.
2006-10-18 19:33:39 +00:00
George Bosilca
688a16ea78 A long time waiting patch. Get rid of the comm->c_pml_procs. It was (and that was
long ago) supposed to be used as a cache for accessing the PML procs. But in
all of the PMLs the PML proc contain only one field i.e. a pointer to the ompi_proc.
This pointer can be accessed using the c_remote_group easily. Therefore, there is no
meaning of keeping the PML procs around. Slim fast commit ...

This commit was SVN r11730.
2006-09-20 22:14:46 +00:00
George Bosilca
e33c35112b Correct the conversion between int and bool. Apply it on all files except
the one that will be modified by Ralph for the ORTE 2.0. The missing ones
are in the rsh PLS.

This commit was SVN r11476.
2006-08-28 18:59:16 +00:00
Sven Stork
d09863926c - fix dist target
This commit was SVN r11421.
2006-08-25 09:45:08 +00:00
George Bosilca
6f4aa36dfc Add this header file in order to allow the export of the component structure.
This commit was SVN r11404.
2006-08-24 17:40:22 +00:00
George Bosilca
3f0a7cad9e The last patch for Windows support. Mostly casting and conversion to C++ friendly headers.
This commit was SVN r11400.
2006-08-24 16:38:08 +00:00
George Bosilca
6afa4c6c64 Windows friendly version. We have to split the OMPI_DECLSPEC in at least 3
different macros, one for each project. Therefore, now we have OPAL_DECLSPEC,
ORTE_DECLSPEC and OMPI_DECLSPEC. Please use them based on the sub-project.

This commit was SVN r11270.
2006-08-20 15:54:04 +00:00
Brian Barrett
f0afe38293 * Need to retain / release datatype and communicator so that the MPI layer
handles can be freed before communication completes.

This commit was SVN r11248.
2006-08-17 16:30:03 +00:00