1
1
Граф коммитов

445 Коммитов

Автор SHA1 Сообщение Дата
Rich Graham
daf7673aff gather socket information - not debugged.`
This commit was SVN r20670.
2009-03-02 10:58:12 +00:00
Rainer Keller
96e1b9b747 - Header orte/mca/rml/rml.h is not needed if no occurence of orte_rml
or ORTE_RML.
   As the others compiles fine with -Wimplicit-function-declaration

This commit was SVN r20639.
2009-02-26 03:52:31 +00:00
Terry Dontje
0178b6c45f Added padding to predefined handle structures to maintain library version to
version compatibility.

This commit was SVN r20627.
2009-02-24 17:17:33 +00:00
Shiqing Fan
2148220ce4 Update the share libs dependency for windows build.
This commit was SVN r20625.
2009-02-23 17:49:46 +00:00
Jeff Squyres
3742c3550c Add "sync" collective component. This component is totally
deactivated by default.  It is activated by setting either of the
following two MCA parameters to values greater than 0:

 * coll_sync_barrier_before
 * coll_sync_barrier_after

If !_before is >0, then the sync coll collective will insert itself
before the underlying collective operations and invoke a barrier
before every Nth barrier (N == coll_sync_barrier_before).  Similar for
!_after.  Note that N is a _per communicator_ value; not global to the
MPI process.

If both are 0 (which is the default), this component returns NULL for
the comm query, meaning that it is not insertted into the coll module
stack. 

The intent of this component is to provide a a workaround for
applications with large numbers of collectives of short messages that
can cause unbounded unexpected messages.  Specifically, it is possible
for some iterative collective communication patterns to cause
unbounded unexpected messages.  Forcing a barrier before or after
every Nth collective operation would prevent that behavior by forcing
applications to synchronize (and thereby consume any outstanding
unexpected messages caused by collectives on the same communicator).

Open MPI still needs to bound unexpected messages resource consumption
at the receiver, but this is a viable workaround for at least some
symptoms of the problem.

Additionally, there has been anecdotal evidence of some applications
that "perfom better" when they put barriers after other collective
operations.  This could be due to many factors -- including shortening
the unexpected message queue.  Putting this component in Open MPI
allows people to try this with their own applications and give real
world feedback on this kind of behavior.

This commit was SVN r20584.
2009-02-18 23:32:44 +00:00
Rainer Keller
d81443cc5a - On the way to get the BTLs split out and lessen dependency on orte:
Often, orte/util/show_help.h is included, although no functionality
   is required -- instead, most often opal_output.h, or               
   orte/mca/rml/rml_types.h                                           
   Please see orte_show_help_replacement.sh commited next.            

 - Local compilation (Linux/x86_64) w/ -Wimplicit-function-declaration
   actually showed two *missing* #include "orte/util/show_help.h"     
   in orte/mca/odls/base/odls_base_default_fns.c and                  
   in orte/tools/orte-top/orte-top.c                                  
   Manually added these.                                              

   Let's have MTT the last word.

This commit was SVN r20557.
2009-02-14 02:26:12 +00:00
Jeff Squyres
8b29e27ead Some minor valgrind-inspired cleanups: fix some memory leaks
This commit was SVN r20543.
2009-02-13 03:45:32 +00:00
Tim Mattox
9b83df22ec Fix some "is proc on local node?" logic that got accidentally flipped
by r20496 for the sm BTL, openib BTL on iWarp, and the sm & sm2 coll modules.

This commit was SVN r20515.

The following SVN revision numbers were found above:
  r20496 --> open-mpi/ompi@4cdf91a8d4
2009-02-11 15:02:38 +00:00
Ralph Castain
4cdf91a8d4 Per the RFC, extend the current use of the ompi_proc_t flags field (without changing the field itself).
The prior ompi_proc_t structure had a uint8_t flag field in it, where only one
bit was used to flag that a proc was "local". In that context, "local" was
constrained to mean "local to this node".

This commit provides a greater degree of granularity on the term "local", to include tests
to see if the proc is on the same socket, PC board, node, switch, CU (computing
unit), and cluster.

Add #define's to designate which bits stand for which local condition. This
was added to the OPAL layer to avoid conflicting with the proposed movement of
the BTLs. To make it easier to use, a set of macros have been defined - e.g.,
OPAL_PROC_ON_LOCAL_SOCKET - that test the specific bit. These can be used in
the code base to clearly indicate which sense of locality is being considered.

All locations in the code base that looked at the current proc_t field have
been changed to use the new macros.

Also modify the orte_ess modules so that each returns a uint8_t (to match the
ompi_proc_t field) that contains a complete description of the locality of this
proc. Obviously, not all environments will be capable of providing such detailed
info. Thus, getting a "false" from a test for "on_local_socket" may simply
indicate a lack of knowledge.

This commit was SVN r20496.
2009-02-10 02:20:16 +00:00
Jeff Squyres
4d8a187450 Two major things in this commit:
* New "op" MPI layer framework
 * Addition of the MPI_REDUCE_LOCAL proposed function (for MPI-2.2)

= Op framework =

Add new "op" framework in the ompi layer.  This framework replaces the
hard-coded MPI_Op back-end functions for (MPI_Op, MPI_Datatype) tuples
for pre-defined MPI_Ops, allowing components and modules to provide
the back-end functions.  The intent is that components can be written
to take advantage of hardware acceleration (GPU, FPGA, specialized CPU
instructions, etc.).  Similar to other frameworks, components are
intended to be able to discover at run-time if they can be used, and
if so, elect themselves to be selected (or disqualify themselves from
selection if they cannot run).  If specialized hardware is not
available, there is a default set of functions that will automatically
be used.

This framework is ''not'' used for user-defined MPI_Ops.

The new op framework is similar to the existing coll framework, in
that the final set of function pointers that are used on any given
intrinsic MPI_Op can be a mixed bag of function pointers, potentially
coming from multiple different op modules.  This allows for hardware
that only supports some of the operations, not all of them (e.g., a
GPU that only supports single-precision operations).

All the hard-coded back-end MPI_Op functions for (MPI_Op,
MPI_Datatype) tuples still exist, but unlike coll, they're in the
framework base (vs. being in a separate "basic" component) and are
automatically used if no component is found at runtime that provides a
module with the necessary function pointers.

There is an "example" op component that will hopefully be useful to
those writing meaningful op components.  It is currently
.ompi_ignore'd so that it doesn't impinge on other developers (it's
somewhat chatty in terms of opal_output() so that you can tell when
its functions have been invoked).  See the README file in the example
op component directory.  Developers of new op components are
encouraged to look at the following wiki pages:

  https://svn.open-mpi.org/trac/ompi/wiki/devel/Autogen
  https://svn.open-mpi.org/trac/ompi/wiki/devel/CreateComponent
  https://svn.open-mpi.org/trac/ompi/wiki/devel/CreateFramework

= MPI_REDUCE_LOCAL =

Part of the MPI-2.2 proposal listed here:

    https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/24

is to add a new function named MPI_REDUCE_LOCAL.  It is very easy to
implement, so I added it (also because it makes testing the op
framework pretty easy -- you can do it in serial rather than via
parallel reductions).  There's even a man page!

This commit was SVN r20280.
2009-01-14 23:44:31 +00:00
Edgar Gabriel
1072812bcf not every element in the pointer array list contains a valid entry. Thus, do not try to free elements if the list returns NULL.
This commit was SVN r20275.
2009-01-14 19:11:30 +00:00
George Bosilca
01adc999c5 Correctly forward the right module if we call another collective function. Kudos to
Edgar for figuring out this tricky bug.

This commit was SVN r20267.
2009-01-14 03:22:54 +00:00
Jeff Squyres
11b375f8b5 CIDs 1080-1090: assert() checks were not sufficient to check for
NEGATIVE_RETURNS from _reg_int() because those are not always
checked.  So replace them with real if() checks.

This commit was SVN r20195.
2009-01-03 15:56:25 +00:00
Jeff Squyres
f13ea32830 Remove the code checkig the MCA "coll" parameter for a list of coll
components to use.  This code was rendered obsolete (albiet harmless)
by the MCA base improvements that only open the components that were
specified by each framework's MCA parameter.

This commit was SVN r20176.
2008-12-31 13:40:51 +00:00
Jeff Squyres
759a295cc9 Gaah -- missed one s/m/component/g
This commit was SVN r20175.
2008-12-31 13:35:37 +00:00
Jeff Squyres
955d1e132d Rename a variable to be "component" (not "m"), to emphasize that it is
the component struct, not a module.

This commit was SVN r20174.
2008-12-31 13:32:46 +00:00
Jeff Squyres
865900dd27 Nothing of substance; just indenting changes (''finally'' update this
framework base to 4 space tabs!).

This commit was SVN r20173.
2008-12-31 12:17:08 +00:00
Jeff Squyres
ce313fa391 Minor fixes to a few comments
This commit was SVN r20172.
2008-12-31 11:34:27 +00:00
Jeff Squyres
d533215dac Fix a comment to reflect the right version number
This commit was SVN r20169.
2008-12-30 12:39:32 +00:00
Nysal Jan
ee8ec6f6b5 Remove dead/redundant code. Minimize number of calloc invocations
This commit was SVN r20121.
2008-12-12 10:55:50 +00:00
Shiqing Fan
a5281f0434 - 1/4 commit for Windows Visual Studio and CCP support:
CMakeLists and .windows files.
  In contribs preconfigured and precompiled parts.

This commit was SVN r20108.
2008-12-10 20:59:20 +00:00
Rolf vandeVaart
137729d2f9 Fix warnings (thanks Jeff) from previous fix. This is extra
fix for ticket #1554.

This commit was SVN r19728.
2008-10-10 14:35:52 +00:00
Tim Mattox
de623ea161 Remove a redundant if & goto.
This commit was SVN r19724.
2008-10-09 15:07:56 +00:00
Rolf vandeVaart
aad4427caa Fix the implementation of MPI_Reduce_scatter on intercommunicators.
We still do an interreduce but it is now followed by an intrascatterv.

This fixes trac:1554.

This commit was SVN r19723.

The following Trac tickets were found above:
  Ticket 1554 --> https://svn.open-mpi.org/trac/ompi/ticket/1554
2008-10-09 14:35:20 +00:00
Rolf vandeVaart
13e8975f83 In the case where we detect a value of 0 in the recvcount
array, fall back to the simpler algorithms.  This is not
the optimal solution, but it works.  

This commit was SVN r19702.
2008-10-07 19:44:51 +00:00
Rolf vandeVaart
0a0ddfc934 Handle MPI_IN_PLACE correctly in the ompi_coll_tuned_reduce_scatter_intra_ring function.
We were not adjusting the sendbuf in this case so we were reducing garbage.

This fixes ticket #1506.

This commit was SVN r19673.
2008-10-02 20:01:27 +00:00
George Bosilca
325d006577 Mostly cleanups, and eventually a little bit more scalable add_procs.
There was an argument that was barely used, and on return at the PML
level it contained nothing usable. It has been removed, so now we're
using less memory ...

This commit was SVN r19657.
2008-09-30 15:47:43 +00:00
George Bosilca
6a9514ee08 Make the code match the comment. I checked with Jelena, and based on the papers we
published this is the expected algorithm for the specified message and communicator
size.

This commit closes ticket #1330.

This commit was SVN r19563.
2008-09-15 23:28:40 +00:00
Edgar Gabriel
ef2bb46e45 no need to create and free the groups. We just want to translate the ranks and we can use the internal group structures right away for that operation. Fixes an issue with groups that have not been freed previously, due to the fact that ompi_group_free was not visible here (I know, this could have been solved also by setting OMPI_DECLSPEC on ompi_group_free, but this solution should be faster.)
This commit was SVN r19362.
2008-08-19 13:59:58 +00:00
Edgar Gabriel
149ecb8d7d 1. debug the four new algorithms
2. fix a bug in the initial communicator creation of llcomm
3. fix a bug which showed up as the result of fixing issue number 2: we have
   to check now whether llcomm has really be created before freeing the 
   according llcomm in hierarch_destruct.

This commit was SVN r19361.
2008-08-18 21:54:35 +00:00
Edgar Gabriel
7cbc4a4077 adding four different algorithms for a hierarchical bcast which try to
generate an overlap between the different layers. Why four versions? Because
there is right now always the trade-off between using non-blocking operations
on a layer with a trivial, linear algorithm and using the more sophisticaed
algorithms in a blocking manner. 

- bcast_intra_seg used the bcast of lcomm and llcomm, similarly
  to original algorithm in hierarch. However, it can segment
  the message, such that we might get an overlap between the two
  layers. This overlap is based on the assumption, that a process
  might be done early with a bcast and can start the next one.
- bcast_intra_seg1: replaces the llcomm->bcast by isend/irecvs
  to increase the overlap, keeps the lcomm->bcast however
- bcast_intra_seg2: replaced lcomm->bcast by isend/irecvs
  to increase the overlap, keeps however llcomm->bcast
- bcast_intra_seg3: replaced both lcomm->bcast and llcomm->bcast
  by isend/irecvs

The code is lightly tested, more testing to follow right now.

This commit was SVN r19358.
2008-08-18 16:05:44 +00:00
George Bosilca
a6e3a47102 Fix typo.
This commit was SVN r19312.
2008-08-17 20:08:38 +00:00
Rich Graham
e64f028d62 add missing header file for errno.
This commit was SVN r19246.
2008-08-12 01:34:13 +00:00
Jeff Squyres
54ab811426 Fix CID 1036: minor resource leak on error
This commit was SVN r19236.
2008-08-11 20:37:36 +00:00
Rainer Keller
ee1fe9015a - Make sure, that the *param_index are > 0 (here, we don't pass
errors up...).
   Coverity CID 1080 - 1090
 - Really make sure, the user does not specify stupid negative values.

This commit was SVN r19233.
2008-08-11 11:21:04 +00:00
George Bosilca
3c8d43deed Remove unused variable (Coverty fix 178).
This commit was SVN r19195.
2008-08-06 14:09:43 +00:00
George Bosilca
567c691354 Remove unused variable (Coverty fix 177).
This commit was SVN r19194.
2008-08-06 14:08:34 +00:00
George Bosilca
f6ebdf8896 Remove unused variable (Coverty fix 176).
This commit was SVN r19193.
2008-08-06 14:07:20 +00:00
George Bosilca
c021427002 Remove unused variable (Coverty fix 175).
This commit was SVN r19192.
2008-08-06 14:06:08 +00:00
George Bosilca
6c8017e9b7 Remove unused variable (Coverty fix 174).
This commit was SVN r19191.
2008-08-06 14:04:54 +00:00
George Bosilca
afc79d1651 Remove unused variable (Coverty fix 173).
This commit was SVN r19190.
2008-08-06 14:03:33 +00:00
George Bosilca
5e3a5b7c13 Remove unused variable (Coverty fix 172).
This commit was SVN r19188.
2008-08-06 14:01:33 +00:00
George Bosilca
d897710e4f Remove unused variable (Coverty fix 171).
This commit was SVN r19187.
2008-08-06 14:00:22 +00:00
George Bosilca
417b727006 Remove unused variable (Coverty fix 170).
This commit was SVN r19186.
2008-08-06 13:59:03 +00:00
George Bosilca
4f91b7806c Remove unused variable (Coverty fix 169).
This commit was SVN r19185.
2008-08-06 13:57:43 +00:00
Rainer Keller
23c2292478 - Fix variable set but not used
Coverity CID1058

This commit was SVN r19184.
2008-08-06 13:57:38 +00:00
Jeff Squyres
0af7ac53f2 Fixes trac:1392, #1400
* add "register" function to mca_base_component_t
   * converted coll:basic and paffinity:linux and paffinity:solaris to
     use this function
   * we'll convert the rest over time (I'll file a ticket once all
     this is committed)
 * add 32 bytes of "reserved" space to the end of mca_base_component_t
   and mca_base_component_data_2_0_0_t to make future upgrades
   [slightly] easier
   * new mca_base_component_t size: 196 bytes
   * new mca_base_component_data_2_0_0_t size: 36 bytes
 * MCA base version bumped to v2.0
   * '''We now refuse to load components that are not MCA v2.0.x'''
 * all MCA frameworks versions bumped to v2.0
 * be a little more explicit about version numbers in the MCA base
   * add big comment in mca.h about versioning philosophy

This commit was SVN r19073.

The following Trac tickets were found above:
  Ticket 1392 --> https://svn.open-mpi.org/trac/ompi/ticket/1392
2008-07-28 22:40:57 +00:00
Jeff Squyres
d37a25a2d0 Remove per http://www.open-mpi.org/community/lists/devel/2008/07/4386.php
This commit was SVN r18972.
2008-07-22 00:57:23 +00:00
Edgar Gabriel
798f47b430 Fixes ticket #1334
hierarch disables itself now if the pml module used is *not* ob1. The reason
is, that the multi-level hierarchy detection algorithm checks the names of the
btl modules used. In case there are no btl's, we would segfault.

Furthermore, three minor changes:
 - the 2-level hierarchy detection is now the default (sm vs. everything else
 in the world).
 - add udapl to the list of protocols checked for by the multi-level hierarch detection
 - some of the verbose statements of hierarch were inaccurate. Fixed those comments/messages.

This commit was SVN r18817.
2008-07-07 18:44:48 +00:00
Ralph Castain
9613b3176c Effectively revert the orte_output system and return to direct use of opal_output at all levels. Retain the orte_show_help subsystem to allow aggregation of show_help messages at the HNP.
After much work by Jeff and myself, and quite a lot of discussion, it has become clear that we simply cannot resolve the infinite loops caused by RML-involved subsystems calling orte_output. The original rationale for the change to orte_output has also been reduced by shifting the output of XML-formatted vs human readable messages to an alternative approach.

I have globally replaced the orte_output/ORTE_OUTPUT calls in the code base, as well as the corresponding .h file name. I have test compiled and run this on the various environments within my reach, so hopefully this will prove minimally disruptive.

This commit was SVN r18619.
2008-06-09 14:53:58 +00:00