
369 Commits

Author SHA1 Message Date
George Bosilca
cb66115512 Add more optimizations in the case where heterogeneous support
is not enabled.

This commit was SVN r18932.
2008-07-17 04:54:47 +00:00
George Bosilca
1dba362a01 Make the ompi_ddt_dump function externally visible.
This commit was SVN r18668.
2008-06-18 08:37:43 +00:00
Ralph Castain
9613b3176c Effectively revert the orte_output system and return to direct use of opal_output at all levels. Retain the orte_show_help subsystem to allow aggregation of show_help messages at the HNP.
After much work by Jeff and myself, and quite a lot of discussion, it has become clear that we simply cannot resolve the infinite loops caused by RML-involved subsystems calling orte_output. The original rationale for the change to orte_output has also been reduced by shifting the output of XML-formatted vs human readable messages to an alternative approach.

I have globally replaced the orte_output/ORTE_OUTPUT calls in the code base, as well as the corresponding .h file name. I have test compiled and run this on the various environments within my reach, so hopefully this will prove minimally disruptive.

This commit was SVN r18619.
2008-06-09 14:53:58 +00:00
George Bosilca
21b940887a Tricky stuff !!! If we post a receive for ZERO bytes and we match it
with something with a different size ... well we segfault. The reason was
that the logic in the PML OB1 calls the convertor based on the length
of the data on the wire and not the length of the data that the receiver
expects.

In other words, this is only half a patch :) It fixes the problem, but we
still have to make sure the unpack is not called at all when the receiver
expects ZERO bytes.

This commit was SVN r18474.
2008-05-21 23:31:34 +00:00
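Illustration for 21b940887a (not part of the original commit message): a minimal sketch of the idea that the copy length should be bounded by what the receiver expects rather than by what arrived on the wire. The function `deliver_payload` and its parameters are hypothetical and do not correspond to the actual PML OB1 or convertor code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy an incoming payload into a posted receive
 * buffer. The copy length is bounded by what the receiver expects, not by
 * what arrived on the wire. Returns the number of bytes delivered. */
size_t deliver_payload(void *recv_buf, size_t expected_len,
                       const void *wire_data, size_t wire_len)
{
    size_t n = wire_len < expected_len ? wire_len : expected_len;

    /* A zero-byte receive may post a NULL buffer, so skip the copy
     * entirely instead of handing NULL to memcpy. */
    if (n == 0) {
        return 0;
    }
    memcpy(recv_buf, wire_data, n);
    return n;
}
```

With a zero-byte receive the function returns before touching the buffer, which corresponds to the "do not call unpack at all" half of the fix that the message says is still outstanding.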
Jeff Squyres
e7ecd56bd2 This commit represents a bunch of work on a Mercurial side branch. As
such, the commit message back to the master SVN repository is fairly
long.

= ORTE Job-Level Output Messages =

Add two new interfaces that should be used for all new code throughout
the ORTE and OMPI layers (we have already done the search-and-replace on
the existing ORTE / OMPI layers):

 * orte_output(): (and corresponding friends ORTE_OUTPUT,
   orte_output_verbose, etc.)  This function sends the output directly
   to the HNP for processing as part of a job-specific output
   channel.  It supports all the same outputs as opal_output()
   (syslog, file, stdout, stderr), but for stdout/stderr, the output
   is sent to the HNP for processing and output.  More on this below.
 * orte_show_help(): This function is a drop-in-replacement for
   opal_show_help(), with two differences in functionality:
   1. the rendered text help message output is sent to the HNP for
      display (rather than outputting directly into the process' stderr
      stream)
   1. the HNP detects duplicate help messages and does not display them
      (so that you don't see the same error message N times, once from
      each of your N MPI processes); instead, it counts "new" instances
      of the help message and displays a message every ~5 seconds when
      there are new ones ("I got X new copies of the help message...")
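The duplicate detection described for orte_show_help() can be pictured with a small table keyed by message topic. This is only a sketch under assumptions: the names `help_table`, `help_record`, and `help_drain`, the fixed capacity, and the use of a topic string as the key are all hypothetical and are not taken from the ORTE sources.

```c
#include <stddef.h>
#include <string.h>

#define MAX_TOPICS 16   /* illustrative fixed capacity */

/* One entry per distinct help message already shown. */
struct help_entry {
    const char *topic;      /* key identifying the message (not copied) */
    unsigned    suppressed; /* duplicates seen since the last report */
};

struct help_table {
    struct help_entry entries[MAX_TOPICS];
    size_t            count;
};

/* Returns 1 if the message should be displayed now (first sighting),
 * 0 if it is a duplicate that was only counted, -1 if the table is full. */
int help_record(struct help_table *t, const char *topic)
{
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].topic, topic) == 0) {
            t->entries[i].suppressed++;
            return 0;
        }
    }
    if (t->count == MAX_TOPICS) {
        return -1;
    }
    t->entries[t->count].topic = topic;
    t->entries[t->count].suppressed = 0;
    t->count++;
    return 1;
}

/* Meant to be called from a periodic timer: returns the number of
 * duplicates accumulated for a topic and resets the counter. */
unsigned help_drain(struct help_table *t, const char *topic)
{
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].topic, topic) == 0) {
            unsigned n = t->entries[i].suppressed;
            t->entries[i].suppressed = 0;
            return n;
        }
    }
    return 0;
}
```

A caller would display the message when `help_record` returns 1 and, every few seconds, print a summary line for each topic whose `help_drain` result is nonzero.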

opal_show_help and opal_output still exist, but they only output in
the current process.  The intent for the new orte_* functions is that
they can apply job-level intelligence to the output.  As such, we
recommend that all new ORTE and OMPI code use the new orte_*
functions, not the opal_* functions.

=== New code ===

For ORTE and OMPI programmers, here's what you need to do differently
in new code:

 * Do not include opal/util/show_help.h or opal/util/output.h.
   Instead, include orte/util/output.h (this one header file has
   declarations for both the orte_output() series of functions and
   orte_show_help()).
 * Effectively s/opal_output/orte_output/gi throughout your code.
   Note that orte_output_open() takes a slightly different argument
   list (as a way to pass data to the filtering stream -- see below),
   so if you explicitly call opal_output_open(), you'll need to
   slightly adapt to the new signature of orte_output_open().
 * Literally s/opal_show_help/orte_show_help/.  The function signature
   is identical.

=== Notes ===

 * orte_output'ing to stream 0 behaves much like opal_output'ing
   to stream 0 did, so leaving a hard-coded "0" as the first
   argument is safe.
 * For systems that do not use ORTE's RML or the HNP, the effect of
   orte_output_* and orte_show_help will be identical to their opal
   counterparts (the additional information passed to
   orte_output_open() will be lost!).  Indeed, the orte_* functions
   simply become trivial wrappers to their opal_* counterparts.  Note
   that we have not tested this; the code is simple but it is quite
   possible that we mucked something up.

= Filter Framework =

Messages sent via the new orte_* functions described above and
messages output via the IOF on the HNP will now optionally be passed
through a new "filter" framework before being output to
stdout/stderr.  The "filter" OPAL MCA framework is intended to allow
preprocessing of messages before they are sent to their final
destinations.  The first component that was written in the filter
framework was to create an XML stream, segregating all the messages
into different XML tags, etc.  This will allow 3rd party tools to read
the stdout/stderr from the HNP and be able to know exactly what each
text message is (e.g., a help message, another OMPI infrastructure
message, stdout from the user process, stderr from the user process,
etc.).

Filtering is not active by default.  Filter components must be
specifically requested, such as:

{{{
$ mpirun --mca filter xml ...
}}}

There can only be one filter component active.

= New MCA Parameters =

The new functionality described above introduces two new MCA
parameters:

 * '''orte_base_help_aggregate''': Defaults to 1 (true), meaning that
   help messages will be aggregated, as described above.  If set to 0,
   all help messages will be displayed, even if they are duplicates
   (i.e., the original behavior).
 * '''orte_base_show_output_recursions''': An MCA parameter to help
   debug one of the known issues, described below.  It is likely that
   this MCA parameter will disappear before v1.3 final.

= Known Issues =

 * The XML filter component is not complete.  The current output from
   this component is preliminary and not real XML.  A bit more work
   needs to be done to have configure.m4 search for an appropriate XML
   library, link it in, and use it at run time.
 * There are possible recursion loops in the orte_output() and
   orte_show_help() functions -- e.g., if RML send calls orte_output()
   or orte_show_help().  We have some ideas how to fix these, but
   figured that it was ok to commit before feature freeze with known
   issues.  The code currently contains sub-optimal workarounds so
   that this will not be a problem, but it would be good to actually
   solve the problem rather than have hackish workarounds before v1.3 final.

This commit was SVN r18434.
2008-05-13 20:00:55 +00:00
Ralph Castain
fa082cafa9 Shift the architecture calculation from the ompi/datatype engine to the opal/util area. This allows us to compute the architecture earlier in the launch and communicate it outside of the modex.
Note: this is an early preliminary step in the movement of portions of the datatype engine to the opal layer.

This commit was SVN r18198.
2008-04-17 20:43:56 +00:00
George Bosilca
58e31d767e Cleanup.
This commit was SVN r18067.
2008-04-02 06:35:24 +00:00
George Bosilca
9738ee7784 Add the logicalx types to fortran.
This commit was SVN r18066.
2008-04-02 06:34:46 +00:00
George Bosilca
5adaa88241 Cleanup the code and make it a little faster.
This commit was SVN r18038.
2008-03-31 17:12:03 +00:00
Rainer Keller
b7efc2b18e - Coverity issues CID 42:
Event var_deref_model: Variable "array_of_integers" tracked as NULL was
   passed to a function that dereferences it. [model]
   The arrays passed down type_get_contents may be NULL, only iff max_* is 0...
   If the max_* parameter does not fit, an error is returned, anyhow.
   One could improve the checks of MPI_PARAM_CHECK, but to be on the
   safe side, fix in dt_args.c.

This commit was SVN r17974.
2008-03-26 09:07:06 +00:00
Ralph Castain
d70e2e8c2b Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately.
Remains to be tested to ensure everything came over cleanly, so please continue to withhold commits a little longer

This commit was SVN r17632.
2008-02-28 01:57:57 +00:00
Brian Barrett
2c142ae0a4 Let file compile with optimized builds (was complaining about undefined
snprintf)

This commit was SVN r17622.
2008-02-27 16:58:38 +00:00
Galen Shipman
6c5c842af6 add include file
This commit was SVN r17483.
2008-02-17 19:36:33 +00:00
George Bosilca
512b24affb Add support for all optional Fortran logical types (MPI_LOGICAL1,
MPI_LOGICAL2, MPI_LOGICAL4 and MPI_LOGICAL8). This commit closes
ticket #331.

This commit was SVN r17473.
2008-02-15 22:54:20 +00:00
George Bosilca
7ae83667e5 Should be #ifdef and not #if.
This commit was SVN r17411.
2008-02-10 21:48:33 +00:00
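Illustration for 7ae83667e5 (not part of the original commit message): the commit does not say which macro was involved, so `HAVE_FEATURE` below is a hypothetical name. The sketch shows one common reason to prefer `#ifdef`: a macro defined with no value makes `#if` fail to compile, while `#ifdef` only asks whether the name exists.

```c
/* A feature macro defined with no value, as a bare #define or a
 * `-DHAVE_FEATURE=` compiler flag would produce. Hypothetical name. */
#define HAVE_FEATURE

/* `#if HAVE_FEATURE` would not compile here: the macro expands to nothing,
 * leaving `#if` without an expression. `#ifdef` works for empty and
 * valued macros alike. */
#ifdef HAVE_FEATURE
#define FEATURE_ENABLED 1
#else
#define FEATURE_ENABLED 0
#endif

int feature_enabled(void)
{
    return FEATURE_ENABLED;
}
```

The opposite failure mode is quieter: an undefined name inside `#if` silently evaluates to 0, so a misspelled macro disables the guarded code without any diagnostic unless a warning such as GCC's `-Wundef` is enabled.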
George Bosilca
f9d05603a1 Minimize the dependencies for the datatype header file (datatype.h)
and update all .c files.

This commit was SVN r17410.
2008-02-10 21:40:47 +00:00
George Bosilca
906e8bf1d1 Replace the ompi_pointer_array with opal_pointer_array. In the next step
(sometime after the merge with the ORTE branch), the opal_pointer_array
will become the only pointer_array implementation (the orte_pointer_array
will be removed).

This commit was SVN r17007.
2007-12-21 06:02:00 +00:00
Shiqing Fan
a0660f4deb - Just some type casts.
This commit was SVN r16100.
2007-09-12 15:29:58 +00:00
George Bosilca
8659a864e9 This is the real fix for ticket 317 and ticket 1065 and ticket 278.
This commit was SVN r16084.
2007-09-10 22:27:59 +00:00
George Bosilca
8622beda54 This commit should fix the issues with ticket 1065. Now, we correctly
duplicate the MPI_UB and MPI_LB datatypes.

This commit was SVN r16083.
2007-09-10 22:13:42 +00:00
Brian Barrett
f53b14bde5 George noted I had this logic completely backwards. Oops.
This commit was SVN r16005.
2007-08-29 16:18:04 +00:00
Brian Barrett
59b22533f2 Enable RDMA for heterogeneous situations. Currently done by overloading
the ompi_convertor_need_buffers function to only return 0 if the convertor
is homogeneous (which it never does on the trunk, but does on v1.2, but
that's a different issue).  Only enable the heterogeneous rdma code for
a btl if it supports it (via a flag), as some btls need some work for this
to work properly.  Currently only TCP and OpenIB are extensively tested.

This commit was SVN r15990.
2007-08-28 21:23:44 +00:00
George Bosilca
d6a676b29e Remove unused variable.
This commit was SVN r15693.
2007-07-30 19:38:26 +00:00
George Bosilca
cf9bccf2e6 prefer snprintf to sprintf.
This commit was SVN r15692.
2007-07-30 19:37:34 +00:00
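Illustration for cf9bccf2e6 (not part of the original commit message): a minimal sketch of why `snprintf` is preferred. The helper `format_entry` is hypothetical; it shows the bounded write and the truncation check that `sprintf` cannot offer.

```c
#include <stdio.h>

/* Formats "name[index]" into buf. Returns 0 on success, -1 if the result
 * did not fit (buf then holds a truncated, still NUL-terminated string). */
int format_entry(char *buf, size_t buf_len, const char *name, int index)
{
    /* snprintf never writes more than buf_len bytes and returns the
     * length the full string would have needed. */
    int needed = snprintf(buf, buf_len, "%s[%d]", name, index);
    if (needed < 0 || (size_t)needed >= buf_len) {
        return -1;
    }
    return 0;
}
```

With `sprintf`, the same call on a too-small buffer would write past its end; here the caller gets an error code and a buffer that is still safe to print.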
George Bosilca
c961cb5749 The Windows support is now back in business.
This commit was SVN r15599.
2007-07-25 03:55:34 +00:00
George Bosilca
725f776bb2 This patch was originally proposed by Brian, I just did some small optimizations.
It solves the problem with the MPI_Aint alignment that showed up on Solaris
Sparc and on heterogeneous environments when dealing with the data-type description.
The solution is to move the displacement array from the packed array if we
detect that the local architecture requires MPI_Aint to be aligned to an
MPI_Aint boundary (which is not the case for x86 architectures if MPI_Aint
is a 64-bit type).

This commit was SVN r15395.
2007-07-13 05:45:02 +00:00
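Illustration for 725f776bb2 (not part of the original commit message): a sketch of the underlying hazard, assuming a packed byte buffer that stores displacements at arbitrary offsets. `aint_t` is a stand-in for MPI_Aint so the example builds without MPI, and both function names are hypothetical. The commit itself moves the displacement array; the sketch only shows a portable way to read a possibly misaligned value.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for MPI_Aint so the sketch builds without an MPI installation. */
typedef ptrdiff_t aint_t;

/* Nonzero if p lies on an aint_t-sized boundary. */
int is_aint_aligned(const void *p)
{
    return ((uintptr_t)p % sizeof(aint_t)) == 0;
}

/* Read a displacement stored at an arbitrary byte offset of a packed
 * description. Casting (packed + offset) to aint_t* and dereferencing it
 * is undefined when the address is misaligned and can trap on strict
 * architectures; memcpy into an aligned local is safe everywhere. */
aint_t read_disp(const unsigned char *packed, size_t offset)
{
    aint_t value;
    memcpy(&value, packed + offset, sizeof value);
    return value;
}
```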
Brian Barrett
1d02b9e7b5 Fix a bunch of issues exposed by Ken Cain in getting Open MPI to work with
VxWorks.  Still some issues remaining, I'm sure.

Refs trac:1010

This commit was SVN r15320.

The following Trac tickets were found above:
  Ticket 1010 --> https://svn.open-mpi.org/trac/ompi/ticket/1010
2007-07-10 03:46:57 +00:00
George Bosilca
11ff1b2c20 Add a few OPAL_LIKELY/OPAL_UNLIKELY to the datatype engine.
This commit was SVN r15302.
2007-07-07 04:31:06 +00:00
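Illustration for 11ff1b2c20 (not part of the original commit message): branch-prediction hints of this kind are commonly built on GCC's `__builtin_expect`. The macros `MY_LIKELY` and `MY_UNLIKELY` and the function `sum_ints` are hypothetical; the real OPAL definitions live in the Open MPI tree and may differ.

```c
#include <stddef.h>

/* Hypothetical macros in the spirit of OPAL_LIKELY/OPAL_UNLIKELY. */
#if defined(__GNUC__)
#define MY_LIKELY(x)   __builtin_expect(!!(x), 1)
#define MY_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define MY_LIKELY(x)   (x)
#define MY_UNLIKELY(x) (x)
#endif

/* Sums an array; the NULL check is the rarely taken error path, so the
 * compiler is told to lay out the loop as the fall-through case. */
long sum_ints(const int *v, size_t n)
{
    if (MY_UNLIKELY(v == NULL)) {
        return -1;
    }
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}
```

The hints change only code layout, never results, so they are best reserved for hot paths where the expected direction of the branch is well known.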
George Bosilca
245310d7a7 If we have to complain about some pointer problems at least complain
about the right thing.

This commit was SVN r15300.
2007-07-06 16:56:04 +00:00
Rainer Keller
8d24934a80 - Add the missing parts: add MPI_REAL2 to the end of the list
of Fortran datatypes (mpif-common.h) and the list of registered
   datatypes: MOOG(REAL2).
   Configure and Compilation with ia32/gcc just finished, naturally
   without real2.

This commit was SVN r15137.
2007-06-19 20:41:28 +00:00
George Bosilca
768ac4a0d8 Allow creation of a datatype from a packed description on 64-bit environments.
Correct the alignment macros.

This commit was SVN r14898.
2007-06-06 07:30:34 +00:00
George Bosilca
6ccffb0d3e Handle all convertor flags in one common place. Clean up the
flags handling a little.

This commit was SVN r14889.
2007-06-05 23:26:07 +00:00
George Bosilca
ee5552249c Don't set the same flag twice.
This commit was SVN r14875.
2007-06-05 18:01:34 +00:00
George Bosilca
df5394cdc2 Rich found a double call to the _PREPARE macro. Removing it gives us some
performance improvements.

This commit was SVN r14873.
2007-06-05 15:41:29 +00:00
Rainer Keller
7575b66131 - The optional Fortran datatypes may not be available
Do not initialize them, if not.
   If initializing them, check for the correct C-equivalent type
   to copy from...
   Issue a warning, when a type (e.g. REAL*16) is not available to
   build the type (here COMPLEX*32).
   This fixes issues with ompi and pacx.

   Works with intel-compiler and FCFLAGS="-i8 -r8" on ia32.

This commit was SVN r14818.

The following SVN revision numbers were found above:
  r8 --> open-mpi/ompi@e952ab1f88
2007-05-31 12:52:06 +00:00
George Bosilca
fff0f21e66 Correctly pack and unpack data description. This might be the fix for
the ticket #919.

This commit was SVN r14812.
2007-05-30 21:56:26 +00:00
Rainer Keller
dd8bea2ea2 - Small comment fixes separately
This commit was SVN r14786.
2007-05-29 15:50:02 +00:00
George Bosilca
146989fee7 Allow for datatypes with more than 2^16-1 entries. The new limit is 2^32-1 and it
is enforced at data-type creation.

This commit was SVN r14758.
2007-05-24 17:24:57 +00:00
George Bosilca
f744e09462 The hopefully final correction for the ticket #919. Make sure we are always aligned
to the max width (MPI_Aint) when we pack the description of a data-type.

This commit was SVN r14754.
2007-05-24 16:08:23 +00:00
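Illustration for f744e09462 (not part of the original commit message): aligning a write position to the widest field is usually done by rounding the offset up to a multiple of that width. The helper `align_up` is hypothetical and assumes the alignment is a power of two, which holds for typical MPI_Aint sizes of 4 or 8 bytes.

```c
#include <stddef.h>

/* Round offset up to the next multiple of align.
 * Assumes align is a nonzero power of two. */
size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}
```

A packer would call this before writing each wide field, for example `pos = align_up(pos, sizeof(MPI_Aint))`, and the unpacker must apply the same rule so both sides agree on the layout.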
George Bosilca
50b26ebb6a Allow the ompi_ddt_init and ompi_ddt_finalize to be visible even when
the visibility feature is on.

This commit was SVN r14726.
2007-05-23 14:02:08 +00:00
Sven Stork
bd29eb9bd1 - backout commit r14667, because internal functionality shouldn't be exported.
NOTE: if visibility is enabled "make check" will fail

This commit was SVN r14668.

The following SVN revision numbers were found above:
  r14667 --> open-mpi/ompi@1f526a95e9
2007-05-16 15:43:44 +00:00
Sven Stork
1f526a95e9 - we need to export these internal symbols because the tests in
test/memory need them.

This commit was SVN r14667.
2007-05-16 15:14:31 +00:00
Sven Stork
91fa494f0e - another missing symbol
This commit was SVN r14657.
2007-05-15 13:38:50 +00:00
Sven Stork
18a5747799 - this symbol is (at least) used by the basic collective component
This commit was SVN r14654.
2007-05-15 12:48:58 +00:00
George Bosilca
cb1b976486 Big update. Correct the behavior for true_lb and true_ub computation
when the size of the data is zero. Now they are not updated, which leaves
us with the correct memory layout in all situations (so far). Update all
the comments to reflect exactly the supported behavior of the DDT engine.

This commit was SVN r14202.
2007-04-03 16:05:15 +00:00
George Bosilca
f518a9c1f6 Remove some warnings from the data-type engine.
This commit was SVN r14181.
2007-03-31 04:14:47 +00:00
George Bosilca
1cb26e3b9c Finally the convertor exports a convenience function to allow a consistent
computation of the current location in the pack/unpack process. This can
be used both for retrieving the pointer to the first byte (in the special
case of the cached RDMA protocol) and for getting the current
position (for the pipelined protocol).

I modified all BTLs, but most of them are still untested.

This commit was SVN r14180.
2007-03-30 22:02:45 +00:00
Josh Hursey
dadca7da88 Merging in the jjhursey-ft-cr-stable branch (r13912 : HEAD).
This merge adds Checkpoint/Restart support to Open MPI. The initial
frameworks and components support a LAM/MPI-like implementation.

This commit follows the risk assessment presented to the Open MPI core
development group on Feb. 22, 2007.

This commit closes trac:158

More details to follow.

This commit was SVN r14051.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r13912

The following Trac tickets were found above:
  Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158
2007-03-16 23:11:45 +00:00
Brian Barrett
e926bed69f Implement MPI_TYPE_CREATE_DARRAY function. Works with MPICH2 darray-pack
test, Sun's darray test, and an internal LANL test code.  I would not
assume it will work properly on other codes, as I'm still not sure I
completely understand what the standard says this function is supposed to
do.

Refs trac:65

This commit was SVN r13967.

The following Trac tickets were found above:
  Ticket 65 --> https://svn.open-mpi.org/trac/ompi/ticket/65
2007-03-08 16:33:08 +00:00
George Bosilca
4b63631535 Allow correct duplication for MPI_UB and MPI_LB. The problem is that we cannot
create a duplicate type, because any duplicate type loses the PREDEFINED flag.
An MPI_LB (respectively MPI_UB) without the PREDEFINED tag is useless, as it's
not a marker anymore. The solution is to return the same pointer, but once
the reference count has been increased. In order for this to work, I allowed
the destruction to check for the reference count of an object before complaining
about destroying a predefined type.

This fixed ticket #317.

This commit was SVN r13942.
2007-03-06 18:21:49 +00:00
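Illustration for 4b63631535 (not part of the original commit message): a sketch of duplicating by sharing. The struct `dtype`, the flag `FLAG_PREDEFINED`, and both functions are hypothetical and much simpler than the real datatype objects; they only show the reference-count logic the message describes.

```c
#include <stddef.h>
#include <stdlib.h>

#define FLAG_PREDEFINED 0x1u   /* hypothetical flag bit */

struct dtype {
    unsigned flags;
    int      refcount;
    size_t   size;
};

/* Duplicate a datatype. Predefined markers are shared by bumping the
 * reference count, so the caller gets back the very same pointer. */
struct dtype *dtype_dup(struct dtype *src)
{
    if (src->flags & FLAG_PREDEFINED) {
        src->refcount++;
        return src;
    }
    struct dtype *copy = malloc(sizeof *copy);
    if (copy == NULL) {
        return NULL;
    }
    *copy = *src;
    copy->refcount = 1;
    return copy;
}

/* Returns 0 on success, -1 when the caller tries to destroy a predefined
 * type that has no outstanding duplicates. */
int dtype_destroy(struct dtype *t)
{
    if (t->flags & FLAG_PREDEFINED) {
        if (t->refcount <= 1) {
            return -1;          /* the genuine "cannot free predefined" case */
        }
        t->refcount--;          /* releasing a duplicate is legitimate */
        return 0;
    }
    if (--t->refcount == 0) {
        free(t);
    }
    return 0;
}
```

Checking the reference count before reporting an error is what lets a user free the handle obtained from a duplicate without tripping the predefined-type guard.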