1
1
Граф коммитов

648 Коммитов

Автор SHA1 Сообщение Дата
Rainer Keller
9dea63d63a - Last of intrusive commits (promised)... err for now.
Anyway, this is blocking the move: do not include pml.h
   if not really needed, aka none of the following used:
     mca_pml
     MCA_PML_CALL
     OMPI_ANY_TAG
     OMPI_ANY_SOURCE
     OMPI_PROC_NULL

 - Notable exceptions (deleting in one header->adding):
   - ompi/mca/mtl/psm/
   - ompi/mca/osc/rdma/
   - ompi/mca/btl/openib/btl_openib_endpoint.c depended on
     pml_base_sendreq.h

 - Tested on Linux/x86-64, this time including make check
   (thanks Jeff and Ralph)

This commit was SVN r20725.
2009-03-04 17:06:51 +00:00
Rainer Keller
811f2bd9b4 - As discussed on RFC, move the ompi_bitmap to the
opal layer.
   Add a check against a maximum (actually get rid of ifs internally to
   opal_bitmap.c) -- the functionality to set the current maximum size
   opal_bitmap_set_max_size() is currently only used in attribute.c
   to set the maximum OMPI_FORTRAN_HANDLE_MAX...

   Tested on linux/x86-64 with intel-tests with all_tests_no_perf_f
   run with 6 procs.
   Let's look into MTT as well...

This commit was SVN r20708.
2009-03-03 22:25:13 +00:00
Rich Graham
7ef1550267 add an index to indicate which socket group I belong to.
This commit was SVN r20672.
2009-03-02 14:39:54 +00:00
Rich Graham
daf7673aff gather socket information - not debugged.`
This commit was SVN r20670.
2009-03-02 10:58:12 +00:00
Rainer Keller
96e1b9b747 - Header orte/mca/rml/rml.h is not needed if no occurence of orte_rml
or ORTE_RML.
   As the others compiles fine with -Wimplicit-function-declaration

This commit was SVN r20639.
2009-02-26 03:52:31 +00:00
Terry Dontje
0178b6c45f Added padding to predefined handle structures to maintain library version to
version compatibility.

This commit was SVN r20627.
2009-02-24 17:17:33 +00:00
Shiqing Fan
2148220ce4 Update the share libs dependency for windows build.
This commit was SVN r20625.
2009-02-23 17:49:46 +00:00
Jeff Squyres
3742c3550c Add "sync" collective component. This component is totally
deactivated by default.  It is activated by setting either of the
following two MCA parameters to values greater than 0:

 * coll_sync_barrier_before
 * coll_sync_barrier_after

If !_before is >0, then the sync coll collective will insert itself
before the underlying collective operations and invoke a barrier
before every Nth barrier (N == coll_sync_barrier_before).  Similar for
!_after.  Note that N is a _per communicator_ value; not global to the
MPI process.

If both are 0 (which is the default), this component returns NULL for
the comm query, meaning that it is not insertted into the coll module
stack. 

The intent of this component is to provide a a workaround for
applications with large numbers of collectives of short messages that
can cause unbounded unexpected messages.  Specifically, it is possible
for some iterative collective communication patterns to cause
unbounded unexpected messages.  Forcing a barrier before or after
every Nth collective operation would prevent that behavior by forcing
applications to synchronize (and thereby consume any outstanding
unexpected messages caused by collectives on the same communicator).

Open MPI still needs to bound unexpected messages resource consumption
at the receiver, but this is a viable workaround for at least some
symptoms of the problem.

Additionally, there has been anecdotal evidence of some applications
that "perfom better" when they put barriers after other collective
operations.  This could be due to many factors -- including shortening
the unexpected message queue.  Putting this component in Open MPI
allows people to try this with their own applications and give real
world feedback on this kind of behavior.

This commit was SVN r20584.
2009-02-18 23:32:44 +00:00
Rainer Keller
d81443cc5a - On the way to get the BTLs split out and lessen dependency on orte:
Often, orte/util/show_help.h is included, although no functionality
   is required -- instead, most often opal_output.h, or               
   orte/mca/rml/rml_types.h                                           
   Please see orte_show_help_replacement.sh commited next.            

 - Local compilation (Linux/x86_64) w/ -Wimplicit-function-declaration
   actually showed two *missing* #include "orte/util/show_help.h"     
   in orte/mca/odls/base/odls_base_default_fns.c and                  
   in orte/tools/orte-top/orte-top.c                                  
   Manually added these.                                              

   Let's have MTT the last word.

This commit was SVN r20557.
2009-02-14 02:26:12 +00:00
Jeff Squyres
8b29e27ead Some minor valgrind-inspired cleanups: fix some memory leaks
This commit was SVN r20543.
2009-02-13 03:45:32 +00:00
Tim Mattox
9b83df22ec Fix some "is proc on local node?" logic that got accidentally flipped
by r20496 for the sm BTL, openib BTL on iWarp, and the sm & sm2 coll modules.

This commit was SVN r20515.

The following SVN revision numbers were found above:
  r20496 --> open-mpi/ompi@4cdf91a8d4
2009-02-11 15:02:38 +00:00
Ralph Castain
4cdf91a8d4 Per the RFC, extend the current use of the ompi_proc_t flags field (without changing the field itself).
The prior ompi_proc_t structure had a uint8_t flag field in it, where only one
bit was used to flag that a proc was "local". In that context, "local" was
constrained to mean "local to this node".

This commit provides a greater degree of granularity on the term "local", to include tests
to see if the proc is on the same socket, PC board, node, switch, CU (computing
unit), and cluster.

Add #define's to designate which bits stand for which local condition. This
was added to the OPAL layer to avoid conflicting with the proposed movement of
the BTLs. To make it easier to use, a set of macros have been defined - e.g.,
OPAL_PROC_ON_LOCAL_SOCKET - that test the specific bit. These can be used in
the code base to clearly indicate which sense of locality is being considered.

All locations in the code base that looked at the current proc_t field have
been changed to use the new macros.

Also modify the orte_ess modules so that each returns a uint8_t (to match the
ompi_proc_t field) that contains a complete description of the locality of this
proc. Obviously, not all environments will be capable of providing such detailed
info. Thus, getting a "false" from a test for "on_local_socket" may simply
indicate a lack of knowledge.

This commit was SVN r20496.
2009-02-10 02:20:16 +00:00
Jeff Squyres
4d8a187450 Two major things in this commit:
* New "op" MPI layer framework
 * Addition of the MPI_REDUCE_LOCAL proposed function (for MPI-2.2)

= Op framework =

Add new "op" framework in the ompi layer.  This framework replaces the
hard-coded MPI_Op back-end functions for (MPI_Op, MPI_Datatype) tuples
for pre-defined MPI_Ops, allowing components and modules to provide
the back-end functions.  The intent is that components can be written
to take advantage of hardware acceleration (GPU, FPGA, specialized CPU
instructions, etc.).  Similar to other frameworks, components are
intended to be able to discover at run-time if they can be used, and
if so, elect themselves to be selected (or disqualify themselves from
selection if they cannot run).  If specialized hardware is not
available, there is a default set of functions that will automatically
be used.

This framework is ''not'' used for user-defined MPI_Ops.

The new op framework is similar to the existing coll framework, in
that the final set of function pointers that are used on any given
intrinsic MPI_Op can be a mixed bag of function pointers, potentially
coming from multiple different op modules.  This allows for hardware
that only supports some of the operations, not all of them (e.g., a
GPU that only supports single-precision operations).

All the hard-coded back-end MPI_Op functions for (MPI_Op,
MPI_Datatype) tuples still exist, but unlike coll, they're in the
framework base (vs. being in a separate "basic" component) and are
automatically used if no component is found at runtime that provides a
module with the necessary function pointers.

There is an "example" op component that will hopefully be useful to
those writing meaningful op components.  It is currently
.ompi_ignore'd so that it doesn't impinge on other developers (it's
somewhat chatty in terms of opal_output() so that you can tell when
its functions have been invoked).  See the README file in the example
op component directory.  Developers of new op components are
encouraged to look at the following wiki pages:

  https://svn.open-mpi.org/trac/ompi/wiki/devel/Autogen
  https://svn.open-mpi.org/trac/ompi/wiki/devel/CreateComponent
  https://svn.open-mpi.org/trac/ompi/wiki/devel/CreateFramework

= MPI_REDUCE_LOCAL =

Part of the MPI-2.2 proposal listed here:

    https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/24

is to add a new function named MPI_REDUCE_LOCAL.  It is very easy to
implement, so I added it (also because it makes testing the op
framework pretty easy -- you can do it in serial rather than via
parallel reductions).  There's even a man page!

This commit was SVN r20280.
2009-01-14 23:44:31 +00:00
Edgar Gabriel
1072812bcf not every element in the pointer array list contains a valid entry. Thus, do not try to free elements if the list returns NULL.
This commit was SVN r20275.
2009-01-14 19:11:30 +00:00
George Bosilca
01adc999c5 Correctly forward the right module if we call another collective function. Kudos to
Edgar for figuring out this tricky bug.

This commit was SVN r20267.
2009-01-14 03:22:54 +00:00
Jeff Squyres
11b375f8b5 CIDs 1080-1090: assert() checks were not sufficient to check for
NEGATIVE_RETURNS from _reg_int() because those are not always
checked.  So replace them with real if() checks.

This commit was SVN r20195.
2009-01-03 15:56:25 +00:00
Jeff Squyres
f13ea32830 Remove the code checkig the MCA "coll" parameter for a list of coll
components to use.  This code was rendered obsolete (albiet harmless)
by the MCA base improvements that only open the components that were
specified by each framework's MCA parameter.

This commit was SVN r20176.
2008-12-31 13:40:51 +00:00
Jeff Squyres
759a295cc9 Gaah -- missed one s/m/component/g
This commit was SVN r20175.
2008-12-31 13:35:37 +00:00
Jeff Squyres
955d1e132d Rename a variable to be "component" (not "m"), to emphasize that it is
the component struct, not a module.

This commit was SVN r20174.
2008-12-31 13:32:46 +00:00
Jeff Squyres
865900dd27 Nothing of substance; just indenting changes (''finally'' update this
framework base to 4 space tabs!).

This commit was SVN r20173.
2008-12-31 12:17:08 +00:00
Jeff Squyres
ce313fa391 Minor fixes to a few comments
This commit was SVN r20172.
2008-12-31 11:34:27 +00:00
Jeff Squyres
d533215dac Fix a comment to reflect the right version number
This commit was SVN r20169.
2008-12-30 12:39:32 +00:00
Nysal Jan
ee8ec6f6b5 Remove dead/redundant code. Minimize number of calloc invocations
This commit was SVN r20121.
2008-12-12 10:55:50 +00:00
Shiqing Fan
a5281f0434 - 1/4 commit for Windows Visual Studio and CCP support:
CMakeLists and .windows files.
  In contribs preconfigured and precompiled parts.

This commit was SVN r20108.
2008-12-10 20:59:20 +00:00
Rolf vandeVaart
137729d2f9 Fix warnings (thanks Jeff) from previous fix. This is extra
fix for ticket #1554.

This commit was SVN r19728.
2008-10-10 14:35:52 +00:00
Tim Mattox
de623ea161 Remove a redundant if & goto.
This commit was SVN r19724.
2008-10-09 15:07:56 +00:00
Rolf vandeVaart
aad4427caa Fix the implementation of MPI_Reduce_scatter on intercommunicators.
We still do an interreduce but it is now followed by an intrascatterv.

This fixes trac:1554.

This commit was SVN r19723.

The following Trac tickets were found above:
  Ticket 1554 --> https://svn.open-mpi.org/trac/ompi/ticket/1554
2008-10-09 14:35:20 +00:00
Rolf vandeVaart
13e8975f83 In the case where we detect a value of 0 in the recvcount
array, fall back to the simpler algorithms.  This is not
the optimal solution, but it works.  

This commit was SVN r19702.
2008-10-07 19:44:51 +00:00
Rolf vandeVaart
0a0ddfc934 Handle MPI_IN_PLACE correctly in the ompi_coll_tuned_reduce_scatter_intra_ring function.
We were not adjusting the sendbuf in this case so we were reducing garbage.

This fixes ticket #1506.

This commit was SVN r19673.
2008-10-02 20:01:27 +00:00
George Bosilca
325d006577 Mostly cleanups, and eventually a little bit more scalable add_procs.
There was an argument that was barely used, and on return at the PML
level it contained nothing usable. It has been removed, so now we're
using less memory ...

This commit was SVN r19657.
2008-09-30 15:47:43 +00:00
George Bosilca
6a9514ee08 Make the code match the comment. I checked with Jelena, and based on the papers we
published this is the expected algorithm for the specified message and communicator
size.

This commit closes ticket #1330.

This commit was SVN r19563.
2008-09-15 23:28:40 +00:00
Edgar Gabriel
ef2bb46e45 no need to create and free the groups. We just want to translate the ranks and we can use the internal group structures right away for that operation. Fixes an issue with groups that have not been freed previously, due to the fact that ompi_group_free was not visible here (I know, this could have been solved also by setting OMPI_DECLSPEC on ompi_group_free, but this solution should be faster.)
This commit was SVN r19362.
2008-08-19 13:59:58 +00:00
Edgar Gabriel
149ecb8d7d 1. debug the four new algorithms
2. fix a bug in the initial communicator creation of llcomm
3. fix a bug which showed up as the result of fixing issue number 2: we have
   to check now whether llcomm has really be created before freeing the 
   according llcomm in hierarch_destruct.

This commit was SVN r19361.
2008-08-18 21:54:35 +00:00
Edgar Gabriel
7cbc4a4077 adding four different algorithms for a hierarchical bcast which try to
generate an overlap between the different layers. Why four versions? Because
there is right now always the trade-off between using non-blocking operations
on a layer with a trivial, linear algorithm and using the more sophisticaed
algorithms in a blocking manner. 

- bcast_intra_seg used the bcast of lcomm and llcomm, similarly
  to original algorithm in hierarch. However, it can segment
  the message, such that we might get an overlap between the two
  layers. This overlap is based on the assumption, that a process
  might be done early with a bcast and can start the next one.
- bcast_intra_seg1: replaces the llcomm->bcast by isend/irecvs
  to increase the overlap, keeps the lcomm->bcast however
- bcast_intra_seg2: replaced lcomm->bcast by isend/irecvs
  to increase the overlap, keeps however llcomm->bcast
- bcast_intra_seg3: replaced both lcomm->bcast and llcomm->bcast
  by isend/irecvs

The code is lightly tested, more testing to follow right now.

This commit was SVN r19358.
2008-08-18 16:05:44 +00:00
George Bosilca
a6e3a47102 Fix typo.
This commit was SVN r19312.
2008-08-17 20:08:38 +00:00
Rich Graham
e64f028d62 add missing header file for errno.
This commit was SVN r19246.
2008-08-12 01:34:13 +00:00
Jeff Squyres
54ab811426 Fix CID 1036: minor resource leak on error
This commit was SVN r19236.
2008-08-11 20:37:36 +00:00
Rainer Keller
ee1fe9015a - Make sure, that the *param_index are > 0 (here, we don't pass
errors up...).
   Coverity CID 1080 - 1090
 - Really make sure, the user does not specify stupid negative values.

This commit was SVN r19233.
2008-08-11 11:21:04 +00:00
George Bosilca
3c8d43deed Remove unused variable (Coverty fix 178).
This commit was SVN r19195.
2008-08-06 14:09:43 +00:00
George Bosilca
567c691354 Remove unused variable (Coverty fix 177).
This commit was SVN r19194.
2008-08-06 14:08:34 +00:00
George Bosilca
f6ebdf8896 Remove unused variable (Coverty fix 176).
This commit was SVN r19193.
2008-08-06 14:07:20 +00:00
George Bosilca
c021427002 Remove unused variable (Coverty fix 175).
This commit was SVN r19192.
2008-08-06 14:06:08 +00:00
George Bosilca
6c8017e9b7 Remove unused variable (Coverty fix 174).
This commit was SVN r19191.
2008-08-06 14:04:54 +00:00
George Bosilca
afc79d1651 Remove unused variable (Coverty fix 173).
This commit was SVN r19190.
2008-08-06 14:03:33 +00:00
George Bosilca
5e3a5b7c13 Remove unused variable (Coverty fix 172).
This commit was SVN r19188.
2008-08-06 14:01:33 +00:00
George Bosilca
d897710e4f Remove unused variable (Coverty fix 171).
This commit was SVN r19187.
2008-08-06 14:00:22 +00:00
George Bosilca
417b727006 Remove unused variable (Coverty fix 170).
This commit was SVN r19186.
2008-08-06 13:59:03 +00:00
George Bosilca
4f91b7806c Remove unused variable (Coverty fix 169).
This commit was SVN r19185.
2008-08-06 13:57:43 +00:00
Rainer Keller
23c2292478 - Fix variable set but not used
Coverity CID1058

This commit was SVN r19184.
2008-08-06 13:57:38 +00:00
Jeff Squyres
0af7ac53f2 Fixes trac:1392, #1400
* add "register" function to mca_base_component_t
   * converted coll:basic and paffinity:linux and paffinity:solaris to
     use this function
   * we'll convert the rest over time (I'll file a ticket once all
     this is committed)
 * add 32 bytes of "reserved" space to the end of mca_base_component_t
   and mca_base_component_data_2_0_0_t to make future upgrades
   [slightly] easier
   * new mca_base_component_t size: 196 bytes
   * new mca_base_component_data_2_0_0_t size: 36 bytes
 * MCA base version bumped to v2.0
   * '''We now refuse to load components that are not MCA v2.0.x'''
 * all MCA frameworks versions bumped to v2.0
 * be a little more explicit about version numbers in the MCA base
   * add big comment in mca.h about versioning philosophy

This commit was SVN r19073.

The following Trac tickets were found above:
  Ticket 1392 --> https://svn.open-mpi.org/trac/ompi/ticket/1392
2008-07-28 22:40:57 +00:00
Jeff Squyres
d37a25a2d0 Remove per http://www.open-mpi.org/community/lists/devel/2008/07/4386.php
This commit was SVN r18972.
2008-07-22 00:57:23 +00:00
Edgar Gabriel
798f47b430 Fixes ticket #1334
hierarch disables itself now if the pml module used is *not* ob1. The reason
is, that the multi-level hierarchy detection algorithm checks the names of the
btl modules used. In case there are no btl's, we would segfault.

Furthermore, three minor changes:
 - the 2-level hierarchy detection is now the default (sm vs. everything else
 in the world).
 - add udapl to the list of protocols checked for by the multi-level hierarch detection
 - some of the verbose statements of hierarch were inaccurate. Fixed those comments/messages.

This commit was SVN r18817.
2008-07-07 18:44:48 +00:00
Ralph Castain
9613b3176c Effectively revert the orte_output system and return to direct use of opal_output at all levels. Retain the orte_show_help subsystem to allow aggregation of show_help messages at the HNP.
After much work by Jeff and myself, and quite a lot of discussion, it has become clear that we simply cannot resolve the infinite loops caused by RML-involved subsystems calling orte_output. The original rationale for the change to orte_output has also been reduced by shifting the output of XML-formatted vs human readable messages to an alternative approach.

I have globally replaced the orte_output/ORTE_OUTPUT calls in the code base, as well as the corresponding .h file name. I have test compiled and run this on the various environments within my reach, so hopefully this will prove minimally disruptive.

This commit was SVN r18619.
2008-06-09 14:53:58 +00:00
Ralph Castain
c992e99035 Remove the tags from orte_output_open and the filtering operation from orte_output - this will be handled differently to improve the XML output interface
This commit was SVN r18557.
2008-06-03 14:24:01 +00:00
Rolf vandeVaart
18879285c7 Fix the selection logic to prevent memory leaks. More work may be done in the priority logic but for now we just fix the leaks and preserve current behavior.
This commit fixes trac:1307.

This commit was SVN r18504.

The following Trac tickets were found above:
  Ticket 1307 --> https://svn.open-mpi.org/trac/ompi/ticket/1307
2008-05-27 14:16:39 +00:00
Rolf vandeVaart
5baa733ad5 Fix another warning (using a variable before it was initialized.)
Thanks Jeff for pointing this out.

This commit was SVN r18489.
2008-05-23 13:57:55 +00:00
Rich Graham
b08839f9f5 change reduce-scatter/gather for non-power of 2. Spreading out the
load for the non-power of 2 phase of the reduction.

This commit was SVN r18486.
2008-05-22 21:42:42 +00:00
Rich Graham
f2a4b67809 automate the allreduce selection logic.
This commit was SVN r18484.
2008-05-22 20:53:35 +00:00
Rich Graham
5900415a25 for non-powers of 2, distribute the work on the first step among all
the procs doing the work.

This commit was SVN r18480.
2008-05-22 18:50:53 +00:00
George Bosilca
c31cc5b270 Remove a warning about line being unused.
This commit was SVN r18472.
2008-05-21 20:46:22 +00:00
Edgar Gabriel
0500420bec fixing a bug in the inter-communicator scatter operation, where we used
accidentally rcount instead of scounts.

This commit was SVN r18466.
2008-05-20 21:17:19 +00:00
Rolf vandeVaart
74d0259480 Add new implentation of barrier. This shows better performance on some clusters.
However, no decision logic is changed by this commit so default behavior has not changed.  This
is only selectable by runtime parameters.

This commit was SVN r18464.
2008-05-20 17:37:41 +00:00
Rolf vandeVaart
71091a19c3 Fix bug in spacing of code per https://svn.open-mpi.org/trac/ompi/wiki/CodingStyle.
This commit was SVN r18463.
2008-05-20 14:11:10 +00:00
Rolf vandeVaart
763f5259a8 Fix memory leak of 88 bytes that occurred on each call to MPI_Comm_dup.
Need to release the items and the item list after selecting the collective
modules that are being used.  Reviewed by Jeff Squyres.

This commit was SVN r18457.
2008-05-19 21:34:01 +00:00
Jeff Squyres
7154776465 Removed unused variable / compiler warning.
This commit was SVN r18454.
2008-05-19 13:41:45 +00:00
Rolf vandeVaart
375406e1fa Remove the ignore files as decided at Tuesday's developers conference call. Now, hierarchical collectives will be compiled in but the priority is still at 0 requiring a user to set mca parameters to enable them.
This commit was SVN r18440.
2008-05-15 01:26:52 +00:00
Jeff Squyres
671f0c379d Remove a whole pile of orte/util/show_help.h's that I missed. :-(
This commit was SVN r18437.
2008-05-14 11:32:33 +00:00
Jeff Squyres
e7ecd56bd2 This commit represents a bunch of work on a Mercurial side branch. As
such, the commit message back to the master SVN repository is fairly
long.

= ORTE Job-Level Output Messages =

Add two new interfaces that should be used for all new code throughout
the ORTE and OMPI layers (we already make the search-and-replace on
the existing ORTE / OMPI layers):

 * orte_output(): (and corresponding friends ORTE_OUTPUT,
   orte_output_verbose, etc.)  This function sends the output directly
   to the HNP for processing as part of a job-specific output
   channel.  It supports all the same outputs as opal_output()
   (syslog, file, stdout, stderr), but for stdout/stderr, the output
   is sent to the HNP for processing and output.  More on this below.
 * orte_show_help(): This function is a drop-in-replacement for
   opal_show_help(), with two differences in functionality:
   1. the rendered text help message output is sent to the HNP for
      display (rather than outputting directly into the process' stderr
      stream)
   1. the HNP detects duplicate help messages and does not display them
      (so that you don't see the same error message N times, once from
      each of your N MPI processes); instead, it counts "new" instances
      of the help message and displays a message every ~5 seconds when
      there are new ones ("I got X new copies of the help message...")

opal_show_help and opal_output still exist, but they only output in
the current process.  The intent for the new orte_* functions is that
they can apply job-level intelligence to the output.  As such, we
recommend that all new ORTE and OMPI code use the new orte_*
functions, not thei opal_* functions.

=== New code ===

For ORTE and OMPI programmers, here's what you need to do differently
in new code:

 * Do not include opal/util/show_help.h or opal/util/output.h.
   Instead, include orte/util/output.h (this one header file has
   declarations for both the orte_output() series of functions and
   orte_show_help()).
 * Effectively s/opal_output/orte_output/gi throughout your code.
   Note that orte_output_open() takes a slightly different argument
   list (as a way to pass data to the filtering stream -- see below),
   so you if explicitly call opal_output_open(), you'll need to
   slightly adapt to the new signature of orte_output_open().
 * Literally s/opal_show_help/orte_show_help/.  The function signature
   is identical.

=== Notes ===

 * orte_output'ing to stream 0 will do similar to what
   opal_output'ing did, so leaving a hard-coded "0" as the first
   argument is safe.
 * For systems that do not use ORTE's RML or the HNP, the effect of
   orte_output_* and orte_show_help will be identical to their opal
   counterparts (the additional information passed to
   orte_output_open() will be lost!).  Indeed, the orte_* functions
   simply become trivial wrappers to their opal_* counterparts.  Note
   that we have not tested this; the code is simple but it is quite
   possible that we mucked something up.

= Filter Framework =

Messages sent view the new orte_* functions described above and
messages output via the IOF on the HNP will now optionally be passed
through a new "filter" framework before being output to
stdout/stderr.  The "filter" OPAL MCA framework is intended to allow
preprocessing to messages before they are sent to their final
destinations.  The first component that was written in the filter
framework was to create an XML stream, segregating all the messages
into different XML tags, etc.  This will allow 3rd party tools to read
the stdout/stderr from the HNP and be able to know exactly what each
text message is (e.g., a help message, another OMPI infrastructure
message, stdout from the user process, stderr from the user process,
etc.).

Filtering is not active by default.  Filter components must be
specifically requested, such as:

{{{
$ mpirun --mca filter xml ...
}}}

There can only be one filter component active.

= New MCA Parameters =

The new functionality described above introduces two new MCA
parameters:

 * '''orte_base_help_aggregate''': Defaults to 1 (true), meaning that
   help messages will be aggregated, as described above.  If set to 0,
   all help messages will be displayed, even if they are duplicates
   (i.e., the original behavior).
 * '''orte_base_show_output_recursions''': An MCA parameter to help
   debug one of the known issues, described below.  It is likely that
   this MCA parameter will disappear before v1.3 final.

= Known Issues =

 * The XML filter component is not complete.  The current output from
   this component is preliminary and not real XML.  A bit more work
   needs to be done to configure.m4 search for an appropriate XML
   library/link it in/use it at run time.
 * There are possible recursion loops in the orte_output() and
   orte_show_help() functions -- e.g., if RML send calls orte_output()
   or orte_show_help().  We have some ideas how to fix these, but
   figured that it was ok to commit before feature freeze with known
   issues.  The code currently contains sub-optimal workarounds so
   that this will not be a problem, but it would be good to actually
   solve the problem rather than have hackish workarounds before v1.3 final.

This commit was SVN r18434.
2008-05-13 20:00:55 +00:00
Rolf vandeVaart
0e32dd1022 Add MPI_Alltoallv to tuned collectives and add a pairwise implementation of MPI_Alltoallv. However, do not change the default behavior for now. The only way to use new pairwise implementation is via mca parameters.
This commit was SVN r18394.
2008-05-07 02:31:24 +00:00
Rich Graham
4d1ae7b05f accidentally made a change in the wrong place.
This commit was SVN r18262.
2008-04-23 17:32:05 +00:00
Rich Graham
293dd6ad4e add myself to list of people building this module.
This commit was SVN r18261.
2008-04-23 17:25:36 +00:00
Rich Graham
7658cc79e4 Pass in the correct module to the reduction call.
This commit was SVN r18260.
2008-04-23 17:23:30 +00:00
Tim Mattox
0215474cb8 Fix two bugs in coll_sm_module.c from bit-rot:
Fixed a selection bug, and removed a bogus "free(proc)" call
which ultimately caused MPI_Finalize to crash.

This commit was SVN r18235.
2008-04-22 18:41:21 +00:00
Rich Graham
df35223603 add selection logic for barrier and reduce.
This commit was SVN r18215.
2008-04-19 22:40:04 +00:00
Rich Graham
bee8b42f29 remove debug code that would not let people run.
Add infrastructure for blocking-barrier.

This commit was SVN r18214.
2008-04-19 01:34:04 +00:00
Rich Graham
6c77fa4921 add a blocking shared memory algorithm.
This commit was SVN r18185.
2008-04-16 22:10:23 +00:00
Rich Graham
249445d61f added reduce-scatter followed by gather to root.
This commit was SVN r18133.
2008-04-11 13:49:08 +00:00
Rich Graham
a6bdbfab97 implement allreduce as reduce-scatter, followed by an allgather.
This commit was SVN r18132.
2008-04-11 04:06:29 +00:00
Rich Graham
70f3aab5f2 remove some code that is not needed.
This commit was SVN r18128.
2008-04-10 17:32:04 +00:00
Rich Graham
5c7db1e315 remove 2 race conditions in the buffer recycling logic.
This commit was SVN r18127.
2008-04-10 17:20:52 +00:00
Edgar Gabriel
4964434205 reverting commit 18122, since the commit was executed accidentally in the
wring directory. The UH copyrights do belong into this file (i.e. because of
the fix which is in the 1.2 branch, the UH copyright notes are in the header
there alreary), but I want to have the proper log for that.  

This commit was SVN r18124.
2008-04-10 15:09:31 +00:00
Edgar Gabriel
f87830767a the verification of recvcount==0 and rank = root was braking
inter-communicator scatter, since the root (root==MPI_ROOT) might very well
have recvcount=0. The same fix has been applied to gather.c just the other way
round. 
 
Fixes the bug reported on the mainling list by Martin Audet. If there is a
1.2.7 this fix might be worthwhile porting it over.

Please note, that while the test works now for basic and for inter, we get a
0byte malloc warning from the inter module, which we still have to fix in a
separate patch.

This commit was SVN r18122.
2008-04-10 14:58:51 +00:00
Rich Graham
c6783549ef getting old
This commit was SVN r18110.
2008-04-09 16:55:16 +00:00
Rich Graham
1a20c3ce51 more debug.
This commit was SVN r18109.
2008-04-09 16:19:52 +00:00
Rich Graham
e7e18303f6 more debug.
This commit was SVN r18108.
2008-04-09 15:10:58 +00:00
Rich Graham
b14c6b17d5 adding debug output.
This commit was SVN r18107.
2008-04-09 13:32:01 +00:00
Rich Graham
10434fb2f1 add barrier synchorinzation at the end of the module init, to
avoid initializing shared memory variables in use.

This commit was SVN r18105.
2008-04-09 03:44:40 +00:00
Rich Graham
19bb1a2e86 fix initialization bug.
This commit was SVN r18104.
2008-04-08 23:34:06 +00:00
Rich Graham
a69a8d9626 initialize the flags.
This commit was SVN r18102.
2008-04-08 22:16:39 +00:00
Rich Graham
8765a2bbdd more debug code.
This commit was SVN r18101.
2008-04-08 20:38:20 +00:00
Rich Graham
08becf33b5 add more debugging.
This commit was SVN r18100.
2008-04-08 18:44:50 +00:00
Rich Graham
aa1b7dd406 more debug
This commit was SVN r18099.
2008-04-08 03:56:47 +00:00
Rich Graham
0c18bdeff7 more debug code.
This commit was SVN r18098.
2008-04-08 03:04:20 +00:00
Rich Graham
9d5a7238df Add some debugging code.
This commit was SVN r18097.
2008-04-07 23:20:15 +00:00
Rich Graham
fa696734d5 add some debug code.
This commit was SVN r18096.
2008-04-07 21:03:23 +00:00
Rich Graham
1b54e8b76e fix buffer management for nb-barrier.
This commit was SVN r18081.
2008-04-05 21:59:04 +00:00
Rich Graham
94f8fd365c a few reduction optimizations. Add bcast.
This commit was SVN r18075.
2008-04-02 19:02:33 +00:00
George Bosilca
a00ca20446 More cleanups.
This commit was SVN r18069.
2008-04-02 06:38:33 +00:00
Rich Graham
eb5d6096f1 add reduction routine - fix buffer recycling logic which was totally
broken.

This commit was SVN r18065.
2008-04-01 22:56:18 +00:00
Rich Graham
90e53ca9ee debug the pipeline algorithm.
This commit was SVN r18008.
2008-03-28 15:10:07 +00:00
Rich Graham
e2ad9c4be2 adjust to change in orte_process_info.
This commit was SVN r17986.
2008-03-27 01:25:28 +00:00
Rich Graham
441fb9fb9e checkpoint.
This commit was SVN r17985.
2008-03-27 01:16:32 +00:00
Ralph Castain
cca449e379 Move an OMPI RML tag to the OMPI layer
This commit was SVN r17950.
2008-03-25 13:30:48 +00:00
Ralph Castain
dc7f45dafd Remove the obsolete and largely unused orte_system_info structure. The only fields that were used in that struct were nodeid and nodename - these have been transferred to the orte_process_info structure.
Only one place used the user name field - session_dir, when formulating the name of the top-level directory. Accordingly, the code for getting the user's id has been moved to the session_dir code.

This commit was SVN r17926.
2008-03-23 23:10:15 +00:00
Rich Graham
a7c836a2b0 fix location of the restrict key word.
Make the tag in the fan-in/fan-out algorithm be fragment based.

This commit was SVN r17903.
2008-03-21 01:40:36 +00:00
Rich Graham
2c66d396b7 take care of some bit-rot with the fanin-fanout method.
This commit was SVN r17902.
2008-03-21 01:08:49 +00:00
Rich Graham
b9520e61dc get the sm optimized allreduce working for all but user defined
operations.  Added to the reduction operations a set of reduction
functions that take 2 input buffers and one output buffer to avoid
some extra memory copies.  These can't be used with user defined
operations.  The intel c collective suite passes both original, and
new (new, not the user defined operations).

This commit was SVN r17901.
2008-03-20 23:51:16 +00:00
Edgar Gabriel
570bbea5e0 fixing the allgather problem reported on the mailing list. The problem was
that at one locatin we had the local-size instead of the remote size as a
receive argument.

This commit was SVN r17849.
2008-03-17 19:42:18 +00:00
Rich Graham
27182afb67 get the timers in correctly.
This commit was SVN r17832.
2008-03-16 03:25:16 +00:00
Rich Graham
afcd1016fd move temp buffer allocation out of the iteration loop - i.e. always use the
same temp loop.  The algorithm is rather synchronous already...

This commit was SVN r17831.
2008-03-16 03:20:46 +00:00
Rich Graham
a1766b29f6 fix some barrier addressing errors.
This commit was SVN r17830.
2008-03-15 22:46:19 +00:00
Rich Graham
0453e7d2f4 bug in management memory allocation - too much memory allocated.
This commit was SVN r17829.
2008-03-15 18:12:20 +00:00
Rich Graham
3c2f1eb8bf reduce the number of temp buffers used.
This commit was SVN r17828.
2008-03-15 17:23:04 +00:00
Rich Graham
0f9d642d51 temp buffer pointers are computed when they are set up. A bit more
efficient, but more important, it is much easier to play around with
memory layout now.

This commit was SVN r17827.
2008-03-15 16:36:35 +00:00
Rich Graham
e3e336b5ab check point
This commit was SVN r17826.
2008-03-15 13:31:21 +00:00
Rich Graham
ebcf928c24 add some diagnostics.
This commit was SVN r17789.
2008-03-07 22:27:41 +00:00
Rich Graham
9131461511 move some test code to another machine.
This commit was SVN r17785.
2008-03-07 19:18:02 +00:00
Rich Graham
c230b65543 fix a couple of bugs. Recursive doubling seems to be working.
This commit was SVN r17777.
2008-03-07 02:51:38 +00:00
Rich Graham
70157166f9 checkpoint - compiles, now neeed to debug.
This commit was SVN r17775.
2008-03-07 00:39:59 +00:00
Rich Graham
4eace9d020 starting to implement recursive doubling algorithm.
This commit was SVN r17765.
2008-03-06 18:38:58 +00:00
Rich Graham
67ad9b6d6b increase max data segments size.
This commit was SVN r17677.
2008-03-02 19:11:09 +00:00
Rich Graham
53126fa7bd add calls to opal_progress()
This commit was SVN r17673.
2008-02-29 23:25:09 +00:00
Rich Graham
d37db14901 get the shared memory collectives working again with the new
version of orte.

This commit was SVN r17672.
2008-02-29 22:28:57 +00:00
Rich Graham
c253a7bda1 simplify the code abit.
This commit was SVN r17664.
2008-02-29 03:55:12 +00:00
Rich Graham
1632d8b299 revert to an older (not previosly checked in) version to get around a
regression.

This commit was SVN r17663.
2008-02-29 03:12:12 +00:00
Rich Graham
827e8d877e fix bug in node type, and some memory copy optimizations.
This commit was SVN r17661.
2008-02-29 01:20:11 +00:00
Rich Graham
940d6732c9 remove compiler warnings.
This commit was SVN r17656.
2008-02-28 22:01:19 +00:00
Rich Graham
2b5fab9d51 avoid 0 byte malloc.
This commit was SVN r17653.
2008-02-28 21:11:42 +00:00
Rich Graham
4b26adef00 remove some debug output.
This commit was SVN r17650.
2008-02-28 20:54:35 +00:00
Rich Graham
5df6c6d043 fix several race conditions.
This commit was SVN r17645.
2008-02-28 19:40:19 +00:00
Ralph Castain
d70e2e8c2b Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately.
Remains to be tested to ensure everything came over cleanly, so please continue to withhold commits a little longer

This commit was SVN r17632.
2008-02-28 01:57:57 +00:00
Rich Graham
68aa691171 checkpoint work.
This commit was SVN r17620.
2008-02-27 14:56:36 +00:00
Rich Graham
b4bbb70bb7 got it all, but for the mem copies. Also, need to make sure volatile declarations are all inplace, as well as memory barriers.
This commit was SVN r17572.
2008-02-25 00:16:21 +00:00
Rich Graham
2d8c2420e8 checkpoint.
This commit was SVN r17571.
2008-02-24 20:54:16 +00:00
Rich Graham
771584bff5 generate reduction tree.
This commit was SVN r17569.
2008-02-24 03:25:40 +00:00
Rich Graham
b9bb78484d a bit of omptimization.
This commit was SVN r17528.
2008-02-20 16:19:49 +00:00
Rich Graham
09afc36f5f correct addressing.
This commit was SVN r17519.
2008-02-20 01:12:43 +00:00
Rich Graham
b87b15580c fix memory allocation error. Initialize pointer.
This commit was SVN r17514.
2008-02-19 20:01:42 +00:00
Rich Graham
1cd8a2e578 checkpoint - works for 2 procs, but not more.
This commit was SVN r17477.
2008-02-17 05:21:58 +00:00
Rich Graham
8006927ae8 free buffer, rather than ask for another one, when done with the memory.
This commit was SVN r17468.
2008-02-15 04:21:58 +00:00
Rich Graham
2277b47ab9 register mca_coll_sm2_allreduce_intra - function still does not do any
reduction operations.

This commit was SVN r17467.
2008-02-15 04:13:00 +00:00
Rich Graham
9b0687e6df add buffer allocation and deallocation calls to the allreduce routine, so
I can start debugging the memory management code.  The allreduce fucntion
 does nothing at this stage.

This commit was SVN r17466.
2008-02-15 03:59:14 +00:00
Rich Graham
41943dbd76 adding missing files.
This commit was SVN r17462.
2008-02-15 00:59:28 +00:00
Rich Graham
41f4b06b39 buffer allocate/release code is fully written, and compiles. Now need to debug.
This commit was SVN r17461.
2008-02-15 00:57:44 +00:00
Rich Graham
7cc58768cd checkpoint something that compiles
This commit was SVN r17460.
2008-02-15 00:33:14 +00:00
Rich Graham
292d930eea check point.
This commit was SVN r17457.
2008-02-14 20:00:26 +00:00
Edgar Gabriel
77057a50a3 - adding the two-level hierarchy detection algorithm
- minor fix in the temporary collectives 
- removing the symmetric parameter, since it didn't really make sense.

This commit was SVN r17359.
2008-02-01 17:11:36 +00:00
Rich Graham
fda485ff9c backing file is allocated and deallocated.
This commit was SVN r17358.
2008-02-01 15:26:20 +00:00
Rich Graham
165fc3f8cc memory allocation implemented and debugged. Still need to finish
file allocation/dealocation and control information initialization.

This commit was SVN r17291.
2008-01-29 03:09:12 +00:00
Rich Graham
e24c2ebbc0 have a working skeleton for the SM-V2 component. It does nothing at this stage.
This commit was SVN r17241.
2008-01-25 21:16:36 +00:00
Rich Graham
1d0334f4f2 skeleton for new shared memory collective component.
This commit was SVN r17235.
2008-01-25 19:35:26 +00:00
Rich Graham
432ba0cecd add comments about the life-cycle of a collective module.
This commit was SVN r17223.
2008-01-25 03:46:31 +00:00
George Bosilca
31390c0074 We should take in account the extent of the datatype when we compute
the initial displacement in bytes. Thanks to Daniel G. Hyams for the fix.

This commit was SVN r17165.
2008-01-19 05:34:53 +00:00
George Bosilca
3fca3973d3 The PTLs are now long gone !!!
This commit was SVN r17104.
2008-01-10 00:18:45 +00:00
George Bosilca
906e8bf1d1 Replace the ompi_pointer_array with opal_pointer_array. The next step
(sometimes after the merge with the ORTE branch), the opal_pointer_array
will became the only pointer_array implementation (the orte_pointer_array
will be removed).

This commit was SVN r17007.
2007-12-21 06:02:00 +00:00
Jeff Squyres
213b5d5c6e Per long threads on the mailing list and much confusion discussion
about linkers, have all OPAL, ORTE, and OMPI components '''not'' link
against the OPAL, ORTE, or OMPI libraries.

See ttp://www.open-mpi.org/community/lists/users/2007/10/4220.php for
details (or https://svn.open-mpi.org/trac/ompi/wiki/Linkers for a
better-formatted version of the same info).

This commit was SVN r16968.
2007-12-15 13:32:02 +00:00
Andrew Friedley
c15047b264 Add LLNL copyright to the file i modified yesterday
This commit was SVN r16404.
2007-10-09 15:18:23 +00:00
Andrew Friedley
fd51d9cf28 The call to opal_list_insert() had an off by one error (I think), causing selected components to get lost with certain load orderings.
I went ahead and rewrote the code to use opal_list_insert_pos() instead, which gives a cleaner flow and more speed.

This commit was SVN r16392.
2007-10-08 23:01:36 +00:00
Jeff Squyres
f92d9097d8 Some more changes to update to coll v1.1.0 that were missed
yesterday.  This actually exposed a very, very long-standing bug where
part of the coll base was incorrectly checking the coll API version
against the MCA API version.  When coll went to v1.1 (yesterday) and
was no longer the same as the MCA v1.0, the test started failing.

This commit fixes to check for v1.1 everywhere in the coll base, and
to ensure to check coll framework/API version numbers against coll
framework/API version numbers (vs. against the MCA API version
number).

This commit was SVN r16373.
2007-10-07 12:20:22 +00:00
Jeff Squyres
3d34bff596 No technical/functional changes: simply change the name of the "data"
parameter to "module" everywhere, just to be a little more clear what
the purpose of that parameter is.

This commit was SVN r16372.
2007-10-07 08:36:45 +00:00
Jeff Squyres
fc2b4376e9 Update forgotten macro.
This commit was SVN r16368.
2007-10-06 14:11:35 +00:00
Jelena Pjesivac-Grbovic
ada43fef9e This fixes bug #1157 in coll/self module.
All vector functions had incorrect handling of the offset.

This commit was SVN r16360.
2007-10-05 17:40:16 +00:00
Andrew Friedley
2e66590993 Fix mistakes in the basic component.. can't call collectives on the communicator and always pass the basic module.. have to give them the module off the communicator.
This commit was SVN r16329.
2007-10-04 16:29:24 +00:00
George Bosilca
1e7a791349 Remove some of the problems identified by Coverty.
This commit was SVN r16112.
2007-09-12 20:13:26 +00:00
George Bosilca
c755938eb0 Coverty: release the temporary buffer on error.
This commit was SVN r16104.
2007-09-12 17:45:12 +00:00
Shiqing Fan
a0660f4deb - Just some type casts.
This commit was SVN r16100.
2007-09-12 15:29:58 +00:00
Jeff Squyres
c4a38f47f6 Resolve Coverity CID 467: remove unused variable / dead code.
This commit was SVN r15997.
2007-08-29 01:23:18 +00:00
Edgar Gabriel
a2f5cada1a convert the hiearch component to the new structure. More testing required before we remove the .ompi_ignore flag again.
This commit was SVN r15954.
2007-08-23 20:41:29 +00:00
Shiqing Fan
a497a3fcad - Fix some small bugs, copy-paste mistakes.
This commit was SVN r15941.
2007-08-21 19:57:28 +00:00
Sven Stork
3985a35c35 - export required symbol
This commit was SVN r15939.
2007-08-21 18:46:11 +00:00
Brian Barrett
af4e86c25f Update collectives selection logic to allow for multiple components to be
used at nce (up to one unique collective module per collective function).
Matches r15795:15921 of the tmp/bwb-coll-select branch

This commit was SVN r15924.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r15795
  r15921
2007-08-19 03:37:49 +00:00
Jelena Pjesivac-Grbovic
9bd9c92dbd Making sure that the decision function for scatter and gather correctly
computes everything for MPI_IN_PLACE case.

This commit was SVN r15841.
2007-08-13 17:35:50 +00:00
Jelena Pjesivac-Grbovic
b558e820cb removing compiler wraning
This commit was SVN r15803.
2007-08-08 15:22:01 +00:00
Jelena Pjesivac-Grbovic
daa10b277e modifying scatter decision function to use binomial algorithm for
small message sizes.

This commit was SVN r15798.
2007-08-07 22:16:13 +00:00
Mohamad Chaarawi
59a7bf8a9f Merging in the Sparse Groups..
This commit includes config changes..

This commit was SVN r15764.
2007-08-04 00:41:26 +00:00
Sven Stork
855434de59 - fixes several coverty issues
- add missing initialisation for variables
  - use strncpy instead of strcpy

This commit was SVN r15683.
2007-07-30 14:44:37 +00:00
Jelena Pjesivac-Grbovic
1b66a52c50 Modifying type of binomial tree used for binomial reduce:
switching:
       0                         0
     / \ \                     / \ \
	1    \ \         -->       4   \ \
  /      \ \                 /     \ \
 3       2  \               3       2 \
             4                         1
(duh).  The first form is the bmtree suitable for bcast, but the latter is better for reduce.
Updating default decision function accordingly.

This commit was SVN r15422.
2007-07-13 21:07:51 +00:00
Jelena Pjesivac-Grbovic
d677db9b5f cleaning up alltoall implementation:
- removing MPI_* calls from bruck implementation
- simplifying 2 process case
- identation, etc.

This commit was SVN r15301.
2007-07-07 01:06:19 +00:00
Jelena Pjesivac-Grbovic
483222085e Fixing compiler warnings.
In gather, the ptmp += incr is irrelevant, since ptmp is set within the loop.

This commit was SVN r15293.
2007-07-05 20:40:50 +00:00
Jelena Pjesivac-Grbovic
3b0a52a104 adding tuned allgatherv implementation using bruck, ring, and neighbor-exchange algorithms.
The implementations passed intel and imb tests up to 40 processes.

This commit was SVN r15280.
2007-07-03 23:33:12 +00:00
Jelena Pjesivac-Grbovic
d55b415bb0 fixing typo
This commit was SVN r15240.
2007-06-28 20:56:55 +00:00
Jelena Pjesivac-Grbovic
8fc8b44d11 Modifying reduce decision function for large, single element reduces (again).
Binary algorithm without segmentation tends to outperform binomial algorithm 
in this case.

This commit was SVN r15226.
2007-06-27 22:01:56 +00:00
Jelena Pjesivac-Grbovic
0ecef1750d Modifying the default reduce decision function to use binomial algorithm
for single-element reduce (segmented algorithms make no sense in this case
and can cause performance degradation). 

This commit was SVN r15209.
2007-06-26 20:14:03 +00:00
Jelena Pjesivac-Grbovic
567b40b9a9 Modifying the default broadcast decision function to use binomial algorithm
for single-element broadcasts (segmented algorithms make no sense in this case
and can cause performance degradation).

This commit was SVN r15208.
2007-06-26 20:08:31 +00:00
Jelena Pjesivac-Grbovic
3740640711 Modifying MPI_Gather in tuned module:
- adding linear algorithm with synchronization for gather.
  This algorithm prevents congestion at root process, but introduces 
  synchronization (serializes non-root processes, but allows messages 
  to arrive from two processes at the same time).  
  It performed better than binomial and linear algorithms for large message, 
  and intermediate and large communicator sizes.
- Updating MPI_Gather decision function to reflect performance results
  from MX.  I will perform more measurements though - so this one can 
  change.

This commit was SVN r15165.
2007-06-21 20:00:36 +00:00
Sven Stork
22af6d38e6 - UNexport symbols that shouldn't be needed outside the libraries
- replace #if/#endif with BEGIN/END_C_DECLS
- reformating

This commit was SVN r14669.
2007-05-16 15:46:52 +00:00
Brian Barrett
21e00f6f0c Clean up a couple of configure things:
* Require Autoconf 2.60 or higher and remove some cruft
    required for AC 2.59 or the AC 2.59 / AC 2.60 mix
  * Remove a bunch of now unnecessary AC_SUBST calls
  * Use the libtool-provided variables for the -I and
    library to use when compiling against ltdl

Fixes trac:1000

This commit was SVN r14652.

The following Trac tickets were found above:
  Ticket 1000 --> https://svn.open-mpi.org/trac/ompi/ticket/1000
2007-05-15 04:23:48 +00:00
Jelena Pjesivac-Grbovic
625c6739ab Removing warning about unsed variable
This commit was SVN r14579.
2007-05-03 20:26:41 +00:00
Jelena Pjesivac-Grbovic
9eff74ad4d Modifying generalized reduce "synchronized" behavior:
- Removing "small" message size limit because it really does not relate to the eager size
accross the board.
Now, the leaf nodes in generalized reduce will use blocking send (DEFAULT/ORIGINAL BEHAVIOR) 
either when the maximum number of outstanding requests is 0 or 
when the total number of segments is less than the maximum number of outstanding requests.
Otherwise, it will send messages using non-blocking synchronized send operation.

This commit was SVN r14572.
2007-05-02 21:42:45 +00:00
George Bosilca
69642a9cd4 Remove 2 warnings about ptrdiff_t to unsigned long implicit conversion.
This commit was SVN r14565.
2007-05-01 19:47:33 +00:00
Jelena Pjesivac-Grbovic
3eac49aa59 Adding flow control for leaf nodes in generalized reduce structure.
This "feature" is disabled by default and it should not affect the current performance.

In case when the message size is large and segment size is smaller than eager size for particular interface,
the leaf nodes in generalized reduce function can overflood parent nodes by sending all segments without 
any synchronization.  This can cause the parent to have HIGH number of unexpected messages (think 16MB 
message with 1KB segments for example).  In case of binomial algorithm root node always has at least one
child which is leaf, so this can potentially affect the root's performance significantly [Especially in 
large communicators where root may have quite a few children (binomial tree for example)].
When the segment size is bigger than the eager size, rendezvous protocol ensures that this does 
not happen so it is not necessary.
Originally, the problem was exposed in "infinite" bucket allocator clean up time for "small" segment sizes
(which may explain some "deadlocks" on Thunderbird tests).

To prevent this, we allow user to specify mca parameter "--mca coll_tuned_reduce_algorithm_max_requests NUM"
this limits number of outstanding messages from a leaf node in generalized reduce to the parent to NUM.
Messages are sent as non-blocking synchrnous messages, so syncronization happens at "wait" time.
The synchronization actually improved performance of pipeline and binomial algorithm for large message sizes
with 1KB segments over MX, but I need to test it some more to make sure it is consistent.

Since there is no easy way to find out what is "the eager" size for particular btl, I set the limit to 4000B.
If message/individual segment size is greater than 4000B - we will not use this feature.  This variable may
or may not be exposed as mca parameter later...

I did not have any problems running it and both "default" and "synchronous" tests passed Intel Reduce* tests 
up to 80 processes (over MX).

This commit was SVN r14518.
2007-04-25 20:39:53 +00:00
Jelena Pjesivac-Grbovic
53cbec7a09 Make coll/tuned dynamic rules more verbose (when promted with --mca coll_base_verbose 1)
This commit was SVN r14469.
2007-04-23 16:34:52 +00:00
Jeff Squyres
51f286d737 Just like r14289 on the ORTE trunk:
Per discussions with Brian and Ralph, make a slight correction in
where components are installed. Use $pkglibdir, not $libdir/openmpi,
so that when compiled in the orte trunk, components are installed to
the right directory (because the component search patch is checking
$pkglibdir).

This commit was SVN r14345.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r14289
2007-04-12 11:19:42 +00:00
George Bosilca
120cf76ad8 Remove some warnings.
This commit was SVN r14196.
2007-04-02 19:11:06 +00:00
George Bosilca
cc65814969 And set the message size before the first use too.
This commit was SVN r14159.
2007-03-28 18:01:13 +00:00
George Bosilca
b540545fa7 Set the communicator size before using it.
This commit was SVN r14158.
2007-03-28 17:59:21 +00:00
Mohamad Chaarawi
bfaf9d4a12 Added new module for intercomm collectives. This will require an
autogen.

This commit was SVN r14149.
2007-03-27 02:06:42 +00:00
Jelena Pjesivac-Grbovic
d6402b6898 Adding in-order binary tree algorithm for non-commutative reduce operations.
I tested algorithm with intel and ibm tests and it passed again - so it should work.

This commit was SVN r14068.
2007-03-19 21:03:57 +00:00
Josh Hursey
dadca7da88 Merging in the jjhursey-ft-cr-stable branch (r13912 : HEAD).
This merge adds Checkpoint/Restart support to Open MPI. The initial
frameworks and components support a LAM/MPI-like implementation.

This commit follows the risk assessment presented to the Open MPI core
development group on Feb. 22, 2007.

This commit closes trac:158

More details to follow.

This commit was SVN r14051.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r13912

The following Trac tickets were found above:
  Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158
2007-03-16 23:11:45 +00:00
Rolf vandeVaart
42168575fd Fix for the special case where np=2 and the sendbuf is set to MPI_IN_PLACE.
In that case, sendcount and sendtype are not valid and we need to use
recvcount and recvtype.

This commit fixes trac:943.  Reviewed by Jelena Pjesivac-Grbovic.

This commit was SVN r14022.

The following Trac tickets were found above:
  Ticket 943 --> https://svn.open-mpi.org/trac/ompi/ticket/943
2007-03-13 19:01:20 +00:00
Jelena Pjesivac-Grbovic
9780a000ba Cleanup of generic reduce function and possible (low probability) bug fix.
- fixing line lengths and some of the comments
- possible bug fix (but I do not think we exposed it in any tests so far)
  temporary buffers were allocated as multiples of extent instead of 
  true_extent + (count -1) * extent.
Everything is still passing Intel tests over tcp and btl mx up to 64 nodes.

This commit was SVN r13956.
2007-03-08 00:54:52 +00:00
Jelena Pjesivac-Grbovic
57cbafafd5 Clean up of generic broadcast function: removing unecessary statements and improving comments.
This commit was SVN r13955.
2007-03-07 21:59:53 +00:00
Jelena Pjesivac-Grbovic
0c07654c30 Updating reduce_scatter decision function based on MX results up to 64 nodes and both 1ppn and 2ppn
configurations.

This commit was SVN r13945.
2007-03-07 00:38:33 +00:00
Jelena Pjesivac-Grbovic
e5ed167a6e Adding tuned version of reduce_scatter implementation.
Currently 3 algorithms are available:
- non-overlapping, reduce + scatterv, (works for non-commutative operations)
- recursive halving algorithm (copied from basic module)
- ring algorithm  (similar to allreduce ring, for large messages)

This commit was SVN r13929.
2007-03-05 20:40:39 +00:00
Li-Ta Lo
196e2a86bb addes binomial tree based scatter, passed IBM and intel tests
This commit was SVN r13906.
2007-03-02 23:19:02 +00:00
Li-Ta Lo
11c94cbe76 eliminated the use of MPI_Get_count
This commit was SVN r13904.
2007-03-02 22:57:50 +00:00
Li-Ta Lo
3765e19d15 added ASCII graph for the topologies
This commit was SVN r13892.
2007-03-02 17:17:14 +00:00
Li-Ta Lo
bd75f2f162 change ALLGATHER to GATHER
This commit was SVN r13891.
2007-03-02 17:02:29 +00:00
Li-Ta Lo
c5d8c221b0 added binomial tree based Gather alogrithm, passed IBM and Intel tests
This commit was SVN r13835.
2007-02-28 01:11:01 +00:00
Jelena Pjesivac-Grbovic
627533fe4a Adding segmented ring algorithm for Allreduce for commutative operations.
Algorithm allows user to specify the segment size to be used for computation/communication overlap.
The additional memory requirement for the algorithm is 2 x segment size.
It performed well for (really) large message sizes over MX and it passed intel Allreduce_c and Allreduce_loc_c tests.

This commit was SVN r13832.
2007-02-27 20:32:30 +00:00
George Bosilca
bec20422ee Remove the warnings about printf data-type mismatch.
This commit was SVN r13804.
2007-02-26 22:20:35 +00:00
Li-Ta Lo
c860bd1be5 fixed a typo in the comment
This commit was SVN r13802.
2007-02-26 19:20:46 +00:00
Li-Ta Lo
73a73b1c78 added ASCII graph on reduce_log_intra
This commit was SVN r13801.
2007-02-26 19:15:37 +00:00
Bill D'Amico
db1c2a58c4 Removed cruft - unused variables causing warnings during OMPI build.
This commit was SVN r13772.
2007-02-23 18:55:41 +00:00
Tim Prins
f35f67ed1c (very) minor correction to helpfile
This commit was SVN r13758.
2007-02-22 16:02:12 +00:00
Li-Ta Lo
049921a5ec the temporary buffer is not needed for the MPI_IN_PLACE cases if the underlying Gather is implemented correctly
This commit was SVN r13740.
2007-02-21 20:39:56 +00:00
Jelena Pjesivac-Grbovic
36156f39c2 Modification to allreduce ring algorithm:
- the block sizes are computed in more uniformn way.
  The first k blocks may be 1 element larger than the remaining blocks.
The algorithm passed Intel Allreduce_c and Allreduce_loc_c tests, and 
IMB-3.2 Allreduce, over TCP and both btl and mtl MX (up to 128 processes).
The algorithm still only supports commutative operations.

This commit was SVN r13738.
2007-02-21 19:30:08 +00:00
Jelena Pjesivac-Grbovic
b608887466 Adding variant of linear alltoall algorithm where the number of
outstanding requests can be limited using mca parameters.
The implementation passed Intel, IMB-3.2, and mpi_test_suite tests over
TCP and MX up to 128 processes (64 nodes), on both 32-bit and 64-bit machines.
It is not activated by default, but it should be useful for really large
communicator sizes.

This commit was SVN r13720.
2007-02-20 04:25:00 +00:00
Jelena Pjesivac-Grbovic
d2d02642ca Removing compilation warnings about the output format.
This commit was SVN r13693.
2007-02-16 23:32:47 +00:00
Jelena Pjesivac-Grbovic
e532b928af Adding segmented binary reduce algorithm which works with non-commutative operations.
Implementation passed intel: MPI_Reduce_c , MPI_Reduce_loc_c, and MPI_Reduce_user_c tests
over TCP, BTL MX, and MTL MX, as well as, mpi_test_suite Reduce tests (up to 64 nodes).

The algorithm is still not activated by decision function (will be in the near future).

This commit was SVN r13657.
2007-02-14 22:38:38 +00:00
Jelena Pjesivac-Grbovic
b52dc9e427 Modifying fixed decision function for reduce to utilize linear algorithm only for really small communicator sizes.
This commit was SVN r13597.
2007-02-10 00:31:10 +00:00
Jelena Pjesivac-Grbovic
6efca498ec Fixes trac:692 in trunk: receive buffer in MPI_Reduce operation is no longer overwritten on non-root nodes.
This commit was SVN r13538.

The following Trac tickets were found above:
  Ticket 692 --> https://svn.open-mpi.org/trac/ompi/ticket/692
2007-02-07 18:57:03 +00:00
Jeff Squyres
c91fcd7fbd Fix a bunch of minor typos submitted by Bernhard Fischer.
This commit was SVN r13505.
2007-02-06 12:00:30 +00:00
Jelena Pjesivac-Grbovic
e193d625bc Bugfix for ring allreduce algorithm.
The step used to iterate through buffer was function of true_extent instead of extent.

This may or may not solve ticket #689 because I am still getting failures over btl mx, 
but I cannot reproduce failures over mtl mx nor tcp.

This commit was SVN r13459.
2007-02-02 02:44:16 +00:00
Brian Barrett
93a2f31932 Use a recursive halving communication algorithm similar to the one used by
MPICH2 for "small" commutative operations in the reduce_scatter basic
implementation.  "small" is currently pretty big, as it doesn't take
much to beat reduce/scatterv.  Need to do much more than this for
better all around performance of MPI_Reduce_scatter, but this was enough
to solve the problems I was having.

This commit was SVN r13348.
2007-01-29 19:29:35 +00:00
Jelena Pjesivac-Grbovic
33dcb4f810 Minor change to linear alltoall algorithm:
- post isends in reverse order of posting irecvs.
if the messages arrive approximately in order, this should 
minimize the time spent in matching the requests.

I did not see any performance difference over MX up to 64 nodes, but 
the change makes sense and may have some impact when we have (many) 
more nodes.

This commit was SVN r13337.
2007-01-26 21:59:31 +00:00
George Bosilca
6f720f0d26 Add all required explicit conversions in order to be able
to build on Windows.

This commit was SVN r13264.
2007-01-24 00:48:16 +00:00
Jelena Pjesivac-Grbovic
5cbcf42dc3 Removing yet another unsed variable (missed it in previous submit).
This commit was SVN r13259.
2007-01-23 21:30:57 +00:00
Jelena Pjesivac-Grbovic
afbd032ff9 Removing compiler warnings about comparison of unsigned values to signed ones, and
unused variables.

This commit was SVN r13258.
2007-01-23 21:10:07 +00:00
Jelena Pjesivac-Grbovic
568477ade8 Adding new Allreduce algorithms, updating allreduce decision function, and cleaning up util.
- Allreduce algorithms:
  - Recursive doubling is used for small messages (up to 10KB) and can be used for 
    both commutative and non-commutative operations.  
	 Recursive doubling passed OCC, IMB-3.2, Intel (Allreduce_c, Allreduce_loc_c, and
	 Allreduce_user_c), mpi_test_suite (Allreduce MIN/MAX, and Allreduce MIN/MAX with 
	 MPI_IN_PLACE) tests on TCP up to 36 nodes and MX up to 64 nodes.
  - Ring algorithms performs well for larger messages but cannot be used for 
    non-commutative operations.  It passed the same tests as recursive doubling, except
	 some of the non-commutative tests in Intel benchmarks Allreduce_loc_c and Allreduce_user_c
	 (which was expected).
- MPI_Allreduce with new decision function passed all of the tests mentioned above.
- Cleaning up coll_tuned_util.  Moving isendrecv to static inline just like sendrecv. 

This commit was SVN r13252.
2007-01-23 01:19:11 +00:00
George Bosilca
242292673a sendrecv is a static inline.
This commit was SVN r13237.
2007-01-22 05:50:23 +00:00
Sven Stork
862dcb1a34 - fix compiler warning in ia64
This commit was SVN r13212.
2007-01-19 14:48:47 +00:00
Jelena Pjesivac-Grbovic
85192c01b0 Modifying util functionality:
- removing static qualification on ompi_coll_tuned_sendrecv 
- adding ompi_coll_tuned_isendrecv function which posts isend and irecv requests
These changes are separate from but necessary for new algorithms I am working on.

This commit was SVN r13161.
2007-01-17 21:29:13 +00:00
Jelena Pjesivac-Grbovic
d2921a9d42 Cleanup of Barrier implementation:
- utilizing coll_tuned_util functions
- setting line length to 80.

This implementation uses standard send messages (instead of synchronous ones).
The change improved our performance over MX multiple number of times, however,
there exists a small potential that last message to be sent can be delayed 
(until next mpi call, which means potentially infinitely).

If this shows to be a problem, I will modify the algorithms to use synchronous
send as last operation (which will incur performance penalty again).

This commit was SVN r13071.
2007-01-10 22:49:43 +00:00
Jelena Pjesivac-Grbovic
ccc3ee0b6b Minor changes to allgather implementation with some clean-up of util code.
- in allgather algorithms I replaces irecv-isend-waitall sequence with 
  call to ompi_coll_tuned_sendrecv
- most of the functions in util code and allgather decision function conform to 80 character line width.
- 

This commit was SVN r13069.
2007-01-10 21:56:59 +00:00
Brian Barrett
a34e67d743 Remove unneeded PARAM_INIT_FILE variable in configure.params files used by
components that use configure.m4 for configuration or are always built. 
The macro has not been needed since moving to configure types other than
configure.stub

Fixes trac:590

This commit was SVN r13031.

The following Trac tickets were found above:
  Ticket 590 --> https://svn.open-mpi.org/trac/ompi/ticket/590
2007-01-08 03:44:22 +00:00
Jelena Pjesivac-Grbovic
eae3df4904 Updated broadcast decision function based on MX results up to 64 nodes.
(The previous decision function did not consider binomial algorithm (since we did not have it at the time)).

This commit was SVN r13007.
2007-01-06 00:37:40 +00:00
Brian Barrett
936fdd2ae1 remove some code that accidently came in with r12974. Refs trac:587
This commit was SVN r12991.

The following SVN revision numbers were found above:
  r12974 --> open-mpi/ompi@27cea44a9c

The following Trac tickets were found above:
  Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
2007-01-04 20:17:07 +00:00
Brian Barrett
27cea44a9c Fix a number of issues with the ompi_ptr_t:
* Make sure that the pval always writes to the correct portion of the
    lval.  This only matters on 32 bit big endian machines.
  * On 32 bit machines when assigning to pval, the other 4 bytes of lval
    weren't being written, which could lead to bogus data

We use macros so that there aren't casts all over the code and the pval
assignment can occur to the correct 4 bytes.  Refs trac:587

This commit was SVN r12974.

The following Trac tickets were found above:
  Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
2007-01-03 19:47:48 +00:00
Jelena Pjesivac-Grbovic
3494e1bb05 - Updated decision function for Alltoall collective.
Fixes "jump" for intermediate sizes message on 24+ number of nodes
    (at least on Grig cluster).

This commit was SVN r12920.
2006-12-22 19:59:17 +00:00
George Bosilca
b1725e02d4 No more warnings plus some code reordering.
This commit was SVN r12919.
2006-12-21 22:42:15 +00:00
Jelena Pjesivac-Grbovic
f1aec23507 Adding tuned allgather implementation.
It contains four algorithms: 
Bruck (ciel(logP) steps), Recursive Doubling (log(P) for power-of-2 processes), Ring (P-1 steps),
and Neighbor Exchange (P/2 steps for even number of processes).

All algorithms passed occ, IMB-2.3, and intel verification tests from ompi-tests/ for up to 56 processes.
The fixed decision function is based on results collected over MX on the Grig cluster at 
the University of Tennessee at Knoxville.  
I have also added (and commented out) copy of MPICH2 decision function for allgather
(from their IJHPCA 2005 paper).

This commit was SVN r12910.
2006-12-21 18:40:02 +00:00
Brian Barrett
6f8b366acb Rename liborte to libopen-rte and libopal to libopen-pal per telecon today
and bug #632.

Refs trac:632

This commit was SVN r12762.

The following Trac tickets were found above:
  Ticket 632 --> https://svn.open-mpi.org/trac/ompi/ticket/632
2006-12-05 18:27:24 +00:00
Ralph Castain
6d6cebb4a7 Bring over the update to terminate orteds that are generated by a dynamic spawn such as comm_spawn. This introduces the concept of a job "family" - i.e., jobs that have a parent/child relationship. Comm_spawn'ed jobs have a parent (the one that spawned them). We track that relationship throughout the lineage - i.e., if a comm_spawned job in turn calls comm_spawn, then it has a parent (the one that spawned it) and a "root" job (the original job that started things).
Accordingly, there are new APIs to the name service to support the ability to get a job's parent, root, immediate children, and all its descendants. In addition, the terminate_job, terminate_orted, and signal_job APIs for the PLS have been modified to accept attributes that define the extent of their actions. For example, doing a "terminate_job" with an attribute of ORTE_NS_INCLUDE_DESCENDANTS will terminate the given jobid AND all jobs that descended from it.

I have tested this capability on a MacBook under rsh, Odin under SLURM, and LANL's Flash (bproc). It worked successfully on non-MPI jobs (both simple and including a spawn), and MPI jobs (again, both simple and with a spawn).

This commit was SVN r12597.
2006-11-14 19:34:59 +00:00
George Bosilca
ec410644ce Implement the send receive as 2 non blocking operations. That will help us
avoiding too many calls to opal_progress.

This commit was SVN r12553.
2006-11-10 23:06:19 +00:00
George Bosilca
c2c6a1b37e Correctly compute the number of elements in a segment.
For broadcast send the correct size for all intermediary nodes.

This commit was SVN r12552.
2006-11-10 23:04:50 +00:00
George Bosilca
7102147b9f Correctly detect when the specified algorithm is out of range. In
this case we reset it to zero.

This commit was SVN r12551.
2006-11-10 21:47:07 +00:00
George Bosilca
af68171253 Use the macro to compute the number of elements in a segment in both
bcast and reduce and update the default values for the variables
as required by the comment in the coll_tuned.h file.

This commit was SVN r12546.
2006-11-10 20:04:08 +00:00
George Bosilca
476b922074 Updates & upgrades:
- consistent arguments checking (not allowing to select an algorithm which
     is not available)
 - consistent way of computing the segcount (number of datatypes by segment).
 - small cleanups.
 - more informative debugging messages.

This commit was SVN r12545.
2006-11-10 19:54:09 +00:00
George Bosilca
77ef979457 New architecture for broadcast. A generic broadcast working on a tree
description. Most of the bcast algorithms can be completed using this
generic function once we create the tree structure. Add all kind of
trees.

There are 2 versions of the generic bcast function. One using overlapping
between receives (for intermediary nodes) and then blocking sends to all
childs and another where all sends are non blocking. I still have to
figure out which one give the smallest overhead.

This commit was SVN r12530.
2006-11-10 05:53:50 +00:00