1
1
Граф коммитов

2528 Коммитов

Автор SHA1 Сообщение Дата
Tim Prins
74555cda51 - Re-enable MPI_Comm_spawn_multiple from singletons. It has been working for a while, but the check was never removed.
- Coding standardize some code
- Remove now unused help message

This commit was SVN r13858.
2007-02-28 22:09:30 +00:00
Tim Mattox
ec82d01555 Add a missing extern keyword that prevented compilation on OS X.
This commit was SVN r13853.
2007-02-28 20:26:34 +00:00
Gleb Natapov
2b6cbd6299 Separate frag lists for RDMA descriptors to two, one for src descriptors
and another for dst descriptors. This provide partial solution to OB1 protocol
deadlock problem. We can limit number of RDMA descriptors (by setting
btl_openib_free_list_max to something different from -1) and if we will be
lucky to hit this limit before we fail to register more memory the protocol
will not deadlock. When we had only one list for src/dst descriptors we
deadlocked when we reached max limit for the list.

This commit was SVN r13844.
2007-02-28 13:43:38 +00:00
Sven Stork
870740efe2 - proper export symbols that are required by other components.
This commit was SVN r13841.
2007-02-28 12:51:55 +00:00
Rainer Keller
0889ebd59f - Eliminate warnings, that PGI-6.2.5 issues with -Minform=inform
This commit was SVN r13840.
2007-02-28 08:36:34 +00:00
Li-Ta Lo
c5d8c221b0 added binomial tree based Gather alogrithm, passed IBM and Intel tests
This commit was SVN r13835.
2007-02-28 01:11:01 +00:00
Jelena Pjesivac-Grbovic
627533fe4a Adding segmented ring algorithm for Allreduce for commutative operations.
Algorithm allows user to specify the segment size to be used for computation/communication overlap.
The additional memory requirement for the algorithm is 2 x segment size.
It performed well for (really) large message sizes over MX and it passed intel Allreduce_c and Allreduce_loc_c tests.

This commit was SVN r13832.
2007-02-27 20:32:30 +00:00
Jeff Squyres
38c976d527 Remove redundant declaration of ompi_err_unknown.
This commit was SVN r13829.
2007-02-27 19:37:42 +00:00
Sven Stork
d8a369936e - Fix more symbols that should be exported.
This commit was SVN r13824.
2007-02-27 15:17:17 +00:00
George Bosilca
533dfff56d Only do the preconnection stage if we found the local proc. It's mostly to
make some compilers complain less about uninitialized values.

This commit was SVN r13805.
2007-02-26 22:24:44 +00:00
George Bosilca
bec20422ee Remove the warnings about printf data-type mismatch.
This commit was SVN r13804.
2007-02-26 22:20:35 +00:00
Brian Barrett
6d70f5fbe0 don't define malloc and friends in opal_config, as it causes problems when
we later include malloc.h

This commit was SVN r13803.
2007-02-26 21:34:48 +00:00
Li-Ta Lo
c860bd1be5 fixed a typo in the comment
This commit was SVN r13802.
2007-02-26 19:20:46 +00:00
Li-Ta Lo
73a73b1c78 added ASCII graph on reduce_log_intra
This commit was SVN r13801.
2007-02-26 19:15:37 +00:00
Pavel Shamis
6fe84f581b mpool_base_module_destroy was removing all modules from
a list instead of removing specific one. Fixing the bug.

This commit was SVN r13795.
2007-02-26 16:25:20 +00:00
Brian Barrett
d9e0e80190 Make some debugging output only looked at when debugging is enabled
This commit was SVN r13777.
2007-02-25 01:03:19 +00:00
Bill D'Amico
db1c2a58c4 Removed cruft - unused variables causing warnings during OMPI build.
This commit was SVN r13772.
2007-02-23 18:55:41 +00:00
Tim Prins
dbe82c70d6 Get rid of stale file, and remove the (unused) references to it.
This commit was SVN r13771.
2007-02-23 13:50:39 +00:00
Tim Prins
bddd06bcdb pedantic formatting...
This commit was SVN r13766.
2007-02-23 00:54:41 +00:00
Tim Prins
f35f67ed1c (very) minor correction to helpfile
This commit was SVN r13758.
2007-02-22 16:02:12 +00:00
Ron Brightwell
e15e85a0b6 Fix a problem with long unexpected messages that was causing hangs.
Long unexpected messages were not generating PUT_START events
because the MD for long unexpected messages was configured to
ignore start events.  When a long unexpected message arrived, it
traversed the match list, and ended up in the long unexpected MD.
As the long message is being consumed, the code called PtlMDUpdate()
to look for the message, but there was no event that indicated
that it had arrived. So, the update succeeded.  Once the long
unexpected message was consumed, the PUT_END event showed up in the
event queue -- except the code wasn't looking for it anymore.
The PUT_START events exist specifically to handle ordering between
short and long unexpected messages, so PUT_START events can't be
ignored on long unexpected messages.

Modified the code to generate PUT_START events for both long and
short unexpected messages and handle matching up START and END
events appropriately.

This commit was SVN r13746.
2007-02-21 21:59:48 +00:00
Li-Ta Lo
049921a5ec the temporary buffer is not needed for the MPI_IN_PLACE cases if the underlying Gather is implemented correctly
This commit was SVN r13740.
2007-02-21 20:39:56 +00:00
Jelena Pjesivac-Grbovic
36156f39c2 Modification to allreduce ring algorithm:
- the block sizes are computed in more uniformn way.
  The first k blocks may be 1 element larger than the remaining blocks.
The algorithm passed Intel Allreduce_c and Allreduce_loc_c tests, and 
IMB-3.2 Allreduce, over TCP and both btl and mtl MX (up to 128 processes).
The algorithm still only supports commutative operations.

This commit was SVN r13738.
2007-02-21 19:30:08 +00:00
Josh Hursey
c573171b7d Mostly a cleanup commit.
- Implement the BML/r2 finialize funciton
- Cleanup the btl close routine
- Wire up a pml_base_verbose MCA parameter so you can actually watch the PML selection logic if you really want to.
- Fix a potental segfault in the selection logic.
  ompi_pointer_array_get_item() may return NULL, so we have to check for it

This commit was SVN r13734.

The following SVN revision numbers were found above:
  r2 --> open-mpi/ompi@58fdc18855
2007-02-21 16:18:43 +00:00
Jelena Pjesivac-Grbovic
b608887466 Adding variant of linear alltoall algorithm where the number of
outstanding requests can be limited using mca parameters.
The implementation passed Intel, IMB-3.2, and mpi_test_suite tests over
TCP and MX up to 128 processes (64 nodes), on both 32-bit and 64-bit machines.
It is not activated by default, but it should be useful for really large
communicator sizes.

This commit was SVN r13720.
2007-02-20 04:25:00 +00:00
Jeff Squyres
f820e44112 Remove a gcc-ism from the code (defining an anonymous union in the
middle of a struct).  Now we properly define and name the union
outside the struct and simply create an instance of it inside the
struct. 

This commit was SVN r13709.
2007-02-19 18:21:57 +00:00
Brian Barrett
727f64aecf It appears that SEND_COMPLETE on a 0 byte message with BTLs that don't
support SEND_IN_PLACE causes badness because the BTL tries to use the
not-exactly-complete convertor.  Don't need it in this situation anyway.

This commit was SVN r13700.
2007-02-19 02:43:26 +00:00
George Bosilca
020b8ade70 A slightly better fix for the data mismatch compiler complaints.
This commit was SVN r13695.
2007-02-17 05:23:57 +00:00
Jelena Pjesivac-Grbovic
d2d02642ca Removing compilation warnings about the output format.
This commit was SVN r13693.
2007-02-16 23:32:47 +00:00
Rich Graham
b925d6588d add some missing error checking - thanks to Ron B.
This commit was SVN r13692.
2007-02-16 22:19:24 +00:00
George Bosilca
04138c23af No more warnings.
This commit was SVN r13683.
2007-02-16 16:25:58 +00:00
Pavel Shamis
edeab0e912 Adding Mellanox Technologies copyright to files touched by Mellanox.
This commit was SVN r13669.
2007-02-15 18:03:20 +00:00
Jeff Squyres
b7b893b771 Need to EXTRA_DIST README.txt so that it gets included in the tarball;
the *_DATA files are not automatically picked up for inclusion into
the tarball.

This commit was SVN r13667.
2007-02-15 16:21:25 +00:00
Jelena Pjesivac-Grbovic
e532b928af Adding segmented binary reduce algorithm which works with non-commutative operations.
Implementation passed intel: MPI_Reduce_c , MPI_Reduce_loc_c, and MPI_Reduce_user_c tests
over TCP, BTL MX, and MTL MX, as well as, mpi_test_suite Reduce tests (up to 64 nodes).

The algorithm is still not activated by decision function (will be in the near future).

This commit was SVN r13657.
2007-02-14 22:38:38 +00:00
Pavel Shamis
2483cefc57 Additional check if descriptor is NULL. It prevents
mca_pml_dr_sendreq_cleanup_active failure on segfault.

This commit was SVN r13647.
2007-02-14 10:43:43 +00:00
Brian Barrett
c00d841741 Fix hang on Cray machine introduced with r13582. The modex will never fire
when on the Cray machine (aka when the NULL GPR is in use).

This commit was SVN r13638.

The following SVN revision numbers were found above:
  r13582 --> open-mpi/ompi@041beeb1b6
2007-02-13 18:34:03 +00:00
Gleb Natapov
4d4b0a022a Add error callback to sm BTL. Call it when allocation of the initial circular
buffer fails. If cb is already allocated, but it is full and allocation of
additional cb fails, we spin waiting for receiver to free space in existing
cb.

This commit was SVN r13635.
2007-02-13 12:01:36 +00:00
George Bosilca
2e042c91cf Once we compute the local offset use it (instead of the global one).
This commit was SVN r13634.
2007-02-13 09:34:04 +00:00
George Bosilca
22eca30b45 One less compiler warning.
This commit was SVN r13633.
2007-02-13 09:32:57 +00:00
George Bosilca
7b7fecad85 More output when position_debug is enabled.
This commit was SVN r13631.
2007-02-13 09:28:39 +00:00
George Bosilca
5214e3751b Correctly handle the pack and unpack for contiguous types with gaps. Th
pack/unpack let the convertor in a consistent state, such that the next operation
will succeed.

This commit was SVN r13630.
2007-02-13 09:28:05 +00:00
Jeff Squyres
dd35fb73ff * Fix some "MPI:Exception" typos (needs 2 :'s)
* Update exactly how we handle MPI exceptions, particularly with
   respect to MPI-1 section 3.2.5, and how error handlers are only
   invoked for the ''first'' request that generates an exception.
 * Update the "see also" section to be consistent across all 8
   MPI_Test* and MPI_Wait* functions.
 * Fixes trac:560

This commit was SVN r13619.

The following Trac tickets were found above:
  Ticket 560 --> https://svn.open-mpi.org/trac/ompi/ticket/560
2007-02-12 18:08:42 +00:00
Gleb Natapov
1033002595 Fix memory leak. Free allocated descriptor if operation cannot proceed.
This commit was SVN r13610.
2007-02-12 09:47:51 +00:00
Jelena Pjesivac-Grbovic
b52dc9e427 Modifying fixed decision function for reduce to utilize linear algorithm only for really small communicator sizes.
This commit was SVN r13597.
2007-02-10 00:31:10 +00:00
Brian Barrett
8b28e5b33d Allow the OOB to connect between all MPI applications during MPI_INIT
without also establishing MPI connectivity. 

This commit was SVN r13595.
2007-02-09 20:17:37 +00:00
Brian Barrett
262cbbc5c9 Back out r13593, which contained a change that shouldn't be committed.
This commit was SVN r13594.

The following SVN revision numbers were found above:
  r13593 --> open-mpi/ompi@81472363ea
2007-02-09 20:13:02 +00:00
Brian Barrett
81472363ea Allow the OOB to connect between all MPI applications during MPI_INIT
without also establishing MPI connectivity.

This commit was SVN r13593.
2007-02-09 20:11:40 +00:00
Brian Barrett
041beeb1b6 Share currently selected PML in the modex information, then check whenever
adding new procs that the remote proc's pml is the same as our local pml.
Turns the hangs from mismatched PMLs into an abort, which is better,
I think.

This commit was SVN r13582.
2007-02-09 16:38:16 +00:00
Galen Shipman
f98a442c82 Fix a problem in the selection logic for MX. Basically we need to be able to
open MTL MX and BTL MX and initialize them at the same time. The problem is
that both call mx_init and mx_finalize, solution is to add an external entity
that does the init and finalize (based on ref counting).

This commit was SVN r13576.
2007-02-09 03:19:38 +00:00
Jeff Squyres
260f1fd468 Fixes trac:817
The C++ bindings were not tracking keyvals properly -- they were
freeing some internal meta data when Free_keyval() was called, not
when the keyval was actually destroyed (keyvals are refcounted in the
C layer, just like all other MPI objects, because they can live for
long after their corresponding Free call is invoked).  This commit
fixes this problem and several other things:

 * Add infrastructure on the ompi_attribute_keyval_t for an "extra"
   destructor pointer that will be invoked during the "real"
   constructor (i.e., when OBJ_RELEASE puts the refcount to 0).  This
   allows calling back into the C++ layer to release meta data
   associated with the keyval.
 * Adjust all cases where keyvals are created to pass in relevant
   destructors (NULL or the C++ destructor).
 * Do essentially the same for MPI::Comm, MPI::Win, and MPI:Datatype:
   * Move several functions out of the .cc file into the _inln.h file
     since they no longer require locks
   * Make the 4 Create_keyval() functions call a common back-end
     keyval creation function that does the Right Thing depending on
     whether C or C++ function pointers were used for the keyval
     functions.  The back-end function does not call the corresponding
     C MPI_*_create_keyval function, but rather does the work itself
     so that it can associate a "destructor" callback for the C++
     bindings for when the keyval is actually destroyed.
   * Change a few type names to be more indicative of what they are
     (mostly dealing with keyvals [not "keys"]).
 * Add the 3 missing bindings for MPI::Comm::Create_keyval().
 * Remove MPI::Comm::comm_map (and associated types) because it's no
   longer necessary in the intercepts -- it was a by-product of being
   a portable C++ bindings layer.  Now we can just query the C layer
   directly to figure out what type a communicator is.  This solves
   some logistics / callback issues, too.
 * Rename several types, variables, and fix many comments in the
   back-end C attribute implementation to make the names really
   reflect what they are (keyvals vs. attributes).  The previous names
   heavily overloaded the name "key" and were ''extremely''
   confusing.

This commit was SVN r13565.

The following Trac tickets were found above:
  Ticket 817 --> https://svn.open-mpi.org/trac/ompi/ticket/817
2007-02-08 23:50:04 +00:00
Jeff Squyres
33619d6b43 Minor fixes for the ompi_bitmap class that were found while
investivating #817:

 * Remove use of legal_numbits member and always just use the full
   size of the array.  There was a corner case where legal_numbits was
   not an even multiple of the number of bits in the array where bits
   would not get freed properly, ususally causing wasted fortran
   MPI handles, or, as in the case of #817, wasted attribute keyvals
   (i.e., the user freed them, but the bitmap didn't reflect the
   free).
 * Re-order some error checks to ensure that we don't segv (we don't
   currently trigger this problem anywhere; I just noticed it while
   doing the other attribute keyval and legal_numbits work).

Since this change affects all Fortran MPI handles, I ran all the intel
and ibm tests and all still pass with this change.

This commit was SVN r13561.
2007-02-08 18:20:36 +00:00
Gleb Natapov
4e5deec496 Fix previous patch. In case of different sm base use pointer to tail after
recalculating it.

This commit was SVN r13557.
2007-02-08 14:59:18 +00:00
Gleb Natapov
c56497cf46 Fix race condition, that happens in circular buffer if consumer drained full
cb before producer set cb_overflow flag.

This commit was SVN r13552.
2007-02-08 07:16:14 +00:00
Jelena Pjesivac-Grbovic
6efca498ec Fixes trac:692 in trunk: receive buffer in MPI_Reduce operation is no longer overwritten on non-root nodes.
This commit was SVN r13538.

The following Trac tickets were found above:
  Ticket 692 --> https://svn.open-mpi.org/trac/ompi/ticket/692
2007-02-07 18:57:03 +00:00
Ralph Castain
c7be9a7121 Complete backout of prior sched_yield and paffinity changes
This commit was SVN r13530.
2007-02-07 14:22:37 +00:00
Jeff Squyres
6ed6864fc4 Include the major, minor, release numbers in mpi.h.
This commit was SVN r13529.
2007-02-07 03:19:44 +00:00
Josh Hursey
90f449f675 fix a typo that got in there
This commit was SVN r13523.
2007-02-06 20:56:48 +00:00
George Bosilca
575075ea77 typo.
This commit was SVN r13510.
2007-02-06 16:53:44 +00:00
Jeff Squyres
0732c555de Refs trac:853
Add new function opal_get_num_processors() that will return the number
of processors on the local host.  Does the Right thing in POSIX
environments (to include a special case for OS X), and will shortly do
the Right Thing for Windows (this commit includes a change to
configure, so I wanted to get that in before the US workday -- the
Windows code can some shortly because it won't involve configury
changes).

This commit was SVN r13506.

The following Trac tickets were found above:
  Ticket 853 --> https://svn.open-mpi.org/trac/ompi/ticket/853
2007-02-06 12:03:56 +00:00
Jeff Squyres
c91fcd7fbd Fix a bunch of minor typos submitted by Bernhard Fischer.
This commit was SVN r13505.
2007-02-06 12:00:30 +00:00
George Bosilca
eaa164d52f Don't check for the DT_FLAG_DATA. It is never set. This was intended to be
a protection against creating subarrays with MPI_LB and MPI_UB. But, I don't
think the MPI standard state anything about this.

This commit was SVN r13504.
2007-02-06 03:58:24 +00:00
Karen Norteman
410f657d07 Fixes trac:876
This commit was SVN r13502.

The following Trac tickets were found above:
  Ticket 876 --> https://svn.open-mpi.org/trac/ompi/ticket/876
2007-02-05 22:26:02 +00:00
Brian Barrett
09cc9e4941 properly compute starting offset -- the lb will be included in the offset, so we don't need
both.

Refs trac:864

This commit was SVN r13494.

The following Trac tickets were found above:
  Ticket 864 --> https://svn.open-mpi.org/trac/ompi/ticket/864
2007-02-05 18:12:18 +00:00
Galen Shipman
ec610a9e65 spread priorities out a bit..
This commit was SVN r13487.
2007-02-04 00:55:25 +00:00
Galen Shipman
ddf08cb0b3 woops..
This commit was SVN r13482.
2007-02-03 02:32:00 +00:00
Galen Shipman
a94101fa62 mostly another hack around for PML selection, allows CM be select itself if an
MTL is available, if not OB1 is used. Still prevents DR and OB1 from stomping
on each other though. 

This commit was SVN r13481.
2007-02-03 02:01:18 +00:00
Christian Bell
e04c55af00 Fixes to psm mtl following a more comprehensive testing of intel tests.
This commit was SVN r13471.
2007-02-02 21:55:04 +00:00
Terry Dontje
244fb76002 This commit fixes trac:856.
This commit was SVN r13464.

The following Trac tickets were found above:
  Ticket 856 --> https://svn.open-mpi.org/trac/ompi/ticket/856
2007-02-02 12:48:26 +00:00
George Bosilca
0ff2115964 Other warnings are now silenced.
This commit was SVN r13462.
2007-02-02 06:47:35 +00:00
Jelena Pjesivac-Grbovic
e193d625bc Bugfix for ring allreduce algorithm.
The step used to iterate through buffer was function of true_extent instead of extent.

This may or may not solve ticket #689 because I am still getting failures over btl mx, 
but I cannot reproduce failures over mtl mx nor tcp.

This commit was SVN r13459.
2007-02-02 02:44:16 +00:00
Karen Norteman
59cc09fd1a Fixes trac:850, updates man pages
This commit was SVN r13456.

The following Trac tickets were found above:
  Ticket 850 --> https://svn.open-mpi.org/trac/ompi/ticket/850
2007-02-01 21:57:48 +00:00
George Bosilca
24997841da Another patch to silence the compilers.
This commit was SVN r13455.
2007-02-01 21:52:44 +00:00
Karen Norteman
4dc71969b1 trac bug 850
This commit was SVN r13452.
2007-02-01 21:29:06 +00:00
Karen Norteman
f545127e7d trac bug 850
This commit was SVN r13451.
2007-02-01 21:28:22 +00:00
Karen Norteman
984d764259 trac bug 850
This commit was SVN r13450.
2007-02-01 21:27:38 +00:00
Karen Norteman
8dd7e175cd trac bug 850
This commit was SVN r13449.
2007-02-01 21:26:59 +00:00
Karen Norteman
47a3f7f760 trac bug 850
This commit was SVN r13448.
2007-02-01 21:26:18 +00:00
Karen Norteman
40eeeaef39 trac bug 850
This commit was SVN r13447.
2007-02-01 21:25:35 +00:00
Karen Norteman
e304eec9dc trac bug 850
This commit was SVN r13446.
2007-02-01 21:24:37 +00:00
Karen Norteman
8a627e8957 trac bug 850
This commit was SVN r13445.
2007-02-01 21:23:54 +00:00
Karen Norteman
25b153848d trac bug 850
This commit was SVN r13444.
2007-02-01 21:23:13 +00:00
Karen Norteman
0c30c74b0a trac bug 850
This commit was SVN r13443.
2007-02-01 21:22:34 +00:00
Karen Norteman
7a60f878a9 trac bug 850
This commit was SVN r13442.
2007-02-01 21:21:43 +00:00
Karen Norteman
11b91057ec trac bug 850
This commit was SVN r13441.
2007-02-01 21:21:00 +00:00
Karen Norteman
180ff99233 trac bug 850
This commit was SVN r13440.
2007-02-01 21:20:14 +00:00
Karen Norteman
a8b066798a trac bug 850
This commit was SVN r13439.
2007-02-01 21:19:08 +00:00
Karen Norteman
e13398b68d trac bug 850
This commit was SVN r13438.
2007-02-01 21:16:43 +00:00
Karen Norteman
cea22389bb trac bug 850
This commit was SVN r13437.
2007-02-01 21:15:43 +00:00
Jeff Squyres
321af727c0 Add external support for MPI_2COMPLEX and MPI_2DOUBLE_COMPLEX. The
machinery was already there; we were just missing the INTEGER's for
them in mpi.f and the #define's for them in mpi.h.

This commit was SVN r13432.
2007-02-01 19:37:38 +00:00
George Bosilca
1c7c39b32b I miss this warnings on my last commit.
This commit was SVN r13431.
2007-02-01 19:34:21 +00:00
Ralph Castain
3daf8b341b Fix the sched_yield problem for generic environments. We now determine and set sched_yield during mpi_init based on the following logical sequence:
1. if the user has specified sched_yield, we simply do what we are told

2. if they didn't specify anything, try to get the number of processors on this node. Note that we already now get the number of local procs in our job that are sharing this node - that now comes in through the proc callback and is stored in the ompi_proc_t structures.

3. if we can get the number of processors, compare that to the number of local procs from my job that are sharing my node. If the number of local procs exceeds the number of processors, then set sched_yield to true. If not, then be a hog and set sched_yield to false

4. if we can't get the number of processors, default to conservative behavior and set sched_yield to true.

Note that I have not yet dealt with the need to dynamically adjust this setting as more processes are added via comm_spawn. So far, we are *only* looking within our own job. Given that we have now moved this logic to mpi_init (and away from the orteds), it isn't yet clear to me how a process will be informed about the number of procs in *other* jobs that are also sharing this node.

Something to continue to ponder.

This commit was SVN r13430.
2007-02-01 19:31:44 +00:00
George Bosilca
79ea6d471b Even less warnings.
This commit was SVN r13429.
2007-02-01 19:27:11 +00:00
George Bosilca
56ffbfc5ff Get rid of the warnings in the Open IB BTL.
This commit was SVN r13424.
2007-02-01 19:07:04 +00:00
George Bosilca
f3a5fe488e No more warnings from the datatype engine.
This commit was SVN r13423.
2007-02-01 18:57:48 +00:00
Brian Barrett
49bcec64e4 More sm startup time reduction
* The real fix, don't leave the OOB in blocking mode during comm_dyn_init(),
    as it means no progressing MPI events while the event library is waiting
    for TCP stuff to come in.
  * Add many comments explaining the reasons for the current ordering

This commit was SVN r13422.
2007-02-01 18:47:43 +00:00
George Bosilca
9e3962b85d Working around the warnings. Unfortunately, this patch will be split
in several commits.

This commit was SVN r13420.
2007-02-01 17:58:18 +00:00
George Bosilca
b611e6d7dc Less warnings.
This commit was SVN r13419.
2007-02-01 17:51:43 +00:00
George Bosilca
6ef3917741 Allow the user to specify the bandwidth and latency for the MX device.
This commit was SVN r13418.
2007-02-01 17:51:00 +00:00
Brian Barrett
58b325b03f Two changes to improve the sm situation with spawn:
* have the mpool size be based on MCW, not num procs
    in other jobs we know about.  Solves the problem of
    the spawned job having a much bigger than needed
    sm file
  * Can't assume that "me" is in the list of procs
    passed to addprocs, so need to use slightly different
    logic and not go through all of add procs unless
    there's a proc in my job that isn't me.

This seems to greatly improve the situation, although
there still seems to be more of a slowdown through
MPI_INIT for the children (if there are more than one
child) than MPI_INIT for the parent if there are 'n'
children compared to 'n' parents.  Hopefully that
made sense ;)

This commit was SVN r13417.
2007-02-01 17:18:35 +00:00
Pak Lui
01c45b9637 * should fix the uninitialized function pointers. This is the patch to the
patch in r13391

This commit was SVN r13411.

The following SVN revision numbers were found above:
  r13391 --> open-mpi/ompi@289fbd08de
2007-02-01 05:07:53 +00:00
Brian Barrett
a0b40ce45a Fix race condition in setting MPI_ERROR -- with buffered sends, the
request can complete before the operation, meaning that a bogus MPI_ERROR
is read

This commit was SVN r13401.
2007-01-31 21:40:14 +00:00
Brian Barrett
039a3d8c17 add comment about why there's no status update here, since I always forget
This commit was SVN r13400.
2007-01-31 21:39:20 +00:00
Brian Barrett
846eed84f1 When receiving a message, need to account for the fact that the displacement
of the first entry might not be the start of the user's buffer.  This is
similar to what ompi_convertor_unpack does.  This is the solution for
the test case attached to ticket #690.

Refs trac:690

This commit was SVN r13397.

The following Trac tickets were found above:
  Ticket 690 --> https://svn.open-mpi.org/trac/ompi/ticket/690
2007-01-31 18:18:19 +00:00
Brian Barrett
65b07140c0 clean up some of the printf warnings caused by the attribute code
This commit was SVN r13395.
2007-01-31 17:11:06 +00:00
Pak Lui
289fbd08de * Fix the MPI::Datatype::Create_keyval() & MPI::Win::Create_keyval()
not being able to take C function pointers for either of the 
	copy or the delete fn. Fix by overloading the Create_keyval methods. 
    Fix trac #737, #738. Reviewed by jsquyres.
  *	A couple of cxx tests in ompi-tests (winkeyval.cc & typekeyval.cc) 
	will be re-enabled to regression test for this fix.

This commit was SVN r13391.
2007-01-31 15:00:41 +00:00
George Bosilca
a02d1c7c8d No more warnings.
This commit was SVN r13382.
2007-01-31 04:27:41 +00:00
George Bosilca
8cb4024903 Assert in debug mode before segfault.
This commit was SVN r13380.
2007-01-31 04:08:13 +00:00
Brian Barrett
ee753694e0 Print out the memlock limit when we can't allocate memory
This commit was SVN r13372.
2007-01-30 21:22:56 +00:00
Rainer Keller
061ba05439 - Fixes uncovered with the format attribute to
opal_output and opal_output_verbose

This commit was SVN r13371.
2007-01-30 20:56:31 +00:00
Rainer Keller
01ba38661d - First usage of the some of the attributes
This commit was SVN r13370.
2007-01-30 20:54:06 +00:00
Jeff Squyres
86f8c66a27 Turns out that the leave_pinned stuff isn't used in these BTLs at
all.  So just remove it.

This commit was SVN r13360.
2007-01-30 15:39:49 +00:00
Rainer Keller
3669e8921e - Fix further compiler warnings regarding initialization
and shadowing variables.

This commit was SVN r13358.
2007-01-30 06:34:38 +00:00
Jeff Squyres
e90b3e415b * Before this commit, if we called ompi_mpi_abort() before MPI_INIT
completed successfully, Bad Things(tm) could happen.
 * Now we explicitly check orte_initialized (a new global in ORTE
   indicating whether we are between orte_init() and orte_finalize()
   or not), and if so, react accordingly.
 * If ORTE is initialized, use orte_system_info.nodename; otherwise,
   use gethostname().
 * Add loop protection to ensure that ompi_mpi_abort() is not invoked
   multiple times recursively.

This commit was SVN r13354.
2007-01-29 22:01:28 +00:00
Jeff Squyres
a45e8bea05 Fixes trac:830. Put in argument checking for MPI_INIT_THREAD.
This commit was SVN r13353.

The following Trac tickets were found above:
  Ticket 830 --> https://svn.open-mpi.org/trac/ompi/ticket/830
2007-01-29 21:56:15 +00:00
Jeff Squyres
0ce78f22fa Compliments r13351 -- use the new field on the ompi_proc_t struct to
know what my local rank is, and therefore set my paffinity ID as
appropriate.  Specifically, we're no longer relying on the
special/secret mpi_paffinity_processor MCA parameter that the orted
would set for us.

This allows processor affinity to be used in environments where the
orted is not used (e.g., bproc, and someday in the hopefully not
too-distant future, SLURM).

This commit was SVN r13352.

The following SVN revision numbers were found above:
  r13351 --> open-mpi/ompi@a338b7e533
2007-01-29 21:53:04 +00:00
Ralph Castain
a338b7e533 Update the proc system to compute and store the number of local procs and my local relative rank during the callback at STG1. This info will be used to locally compute and set the processor affinity and sched_yield (for oversubscribed conditions).
Over to Jeff now for modifying mpi_init accordingly.

Until Jeff makes his changes, nobody should see anything different as the new info just isn't used by anything!

This commit was SVN r13351.
2007-01-29 20:54:55 +00:00
Jeff Squyres
c9f072b84f Strike down a few more stray places that were registering
mpi_leave_pinned and replace them with the one central global
variable.

This commit was SVN r13349.
2007-01-29 20:24:31 +00:00
Brian Barrett
93a2f31932 Use a recursive halving communication algorithm similar to the one used by
MPICH2 for "small" commutative operations in the reduce_scatter basic
implementation.  "small" is currently pretty big, as it doesn't take
much to beat reduce/scatterv.  Need to do much more than this for
better all around performance of MPI_Reduce_scatter, but this was enough
to solve the problems I was having.

This commit was SVN r13348.
2007-01-29 19:29:35 +00:00
Jeff Squyres
61bc9fed3c Duh. Free the temp array when we're done with it. While we're making
this switch, use a clear variable name, too.

This commit was SVN r13347.
2007-01-29 19:04:57 +00:00
Jeff Squyres
e6a0069e95 Fix from an as-yet uncommitted test that Pak will commit shortly.
Found another places that we were incorrectly casting a C++ MPI handle
array to the corresponding C array type and hoping for the best (which
won't work at all).  This commit fixes things so that we now do the
proper conversion between C<-->C++ handles.

This commit was SVN r13346.
2007-01-29 18:44:46 +00:00
Rainer Keller
ca35881cd0 - Minor bugfixes and removed compiler warnings
This commit was SVN r13343.
2007-01-28 19:52:09 +00:00
Jeff Squyres
3c5c8c3c4c Refinement of Rainer's r13227 and r13228 (worked with Rainer, Ralph,
and George on these refinements):

 * Rename the static OBJ initializer macro to be
   OPAL_OBJ_STATIC_INIT(class)
 * Ensure that all static OBJ initializations get a refcount of 1
   (doesn't ''really'' matter, since they're static, it should never
   get to the point where the OBJ is DESTRUCTed, but more correct
   nonetheless)
 * Add a "magic number" to the OBJ when compiling with debug support.
   The magic number does some rudimentary support to ensure that
   you're operating on a valid OBJ (and fails an assertion if you're
   not).  Check to ensure that the memory contains the magic number
   when performing actions of OBJ's.  Also remove the magic number
   when DESTRUCTing OBJs, so that if, for example, an OBJ is
   DESTRUCTed more than once, we'll fail the magic number assert.

This commit was SVN r13338.

The following SVN revision numbers were found above:
  r13227 --> open-mpi/ompi@96030de97b
  r13228 --> open-mpi/ompi@c2e9075d29
2007-01-27 13:44:03 +00:00
Jelena Pjesivac-Grbovic
33dcb4f810 Minor change to linear alltoall algorithm:
- post isends in reverse order of posting irecvs.
if the messages arrive approximately in order, this should 
minimize the time spent in matching the requests.

I did not see any performance difference over MX up to 64 nodes, but 
the change makes sense and may have some impact when we have (many) 
more nodes.

This commit was SVN r13337.
2007-01-26 21:59:31 +00:00
Brian Barrett
385a435813 Start long message send as soon as possible, to minimze ack time for the receive,
greatly increasing mid-range bandwidth

This commit was SVN r13317.
2007-01-25 23:07:03 +00:00
Rich Graham
1c20feb52b Take into account constants that in the cray headers are defined different than in the portals spec.
This commit was SVN r13311.
2007-01-25 18:32:47 +00:00
Jeff Squyres
7b6ed64c7b Add in the hostname to the BTL_* output macros so that you can tell on
which node an event occurred.

This commit was SVN r13302.
2007-01-25 14:02:54 +00:00
George Bosilca
f9a3bbfd7a Don't miss the ODLS component from the output.
This commit was SVN r13299.
2007-01-25 08:37:36 +00:00
Jeff Squyres
b9120aed6d Always make sure to create $(includedir)/openmpi, even if we were
configured with --disable-mpi-cxx so that the default -I flags in the
wrapper compilers don't point to a directory that doesn't exist.
Thanks to Martin Audet for identifying the problem.

This commit was SVN r13296.
2007-01-25 01:59:22 +00:00
Jeff Squyres
6fea000e5f Oops -- get the right function name (copy-n-paste error).
This commit was SVN r13290.
2007-01-24 22:31:13 +00:00
Jeff Squyres
6b69ea664d Make a much, much better error message for a not-uncommon failure
scenario (user/sysadmin forgot to set the memlock limits high
enough).

This commit was SVN r13289.
2007-01-24 22:25:40 +00:00
Patrick Geoffray
b252cb82c8 oops, ".", not "->", copy error...
This commit was SVN r13287.
2007-01-24 19:16:46 +00:00
Patrick Geoffray
d58f6b2451 Free memory in synchronous send case if free_after requires it.
Fixes memory leak using synchronous sends and custom data types.

This commit was SVN r13286.
2007-01-24 19:10:38 +00:00
George Bosilca
d19a4f4740 Cast it to make cl happy.
This commit was SVN r13267.
2007-01-24 00:51:01 +00:00
George Bosilca
790f175d4e Explicit conversions to make the code Windows friendly.
This commit was SVN r13266.
2007-01-24 00:50:24 +00:00
George Bosilca
a4488ff8d2 Add explicit conversions.
This commit was SVN r13265.
2007-01-24 00:49:08 +00:00
George Bosilca
6f720f0d26 Add all required explicit conversions in order to be able
to build on Windows.

This commit was SVN r13264.
2007-01-24 00:48:16 +00:00
Jeff Squyres
c9fe68c406 Better patch from Gleb to do the per-port (endpoint) specification of
whether to use eager RDMA or not

This commit was SVN r13262.
2007-01-23 22:40:59 +00:00
Jelena Pjesivac-Grbovic
5cbcf42dc3 Removing yet another unsed variable (missed it in previous submit).
This commit was SVN r13259.
2007-01-23 21:30:57 +00:00
Jelena Pjesivac-Grbovic
afbd032ff9 Removing compiler warnings about comparison of unsigned values to signed ones, and
unused variables.

This commit was SVN r13258.
2007-01-23 21:10:07 +00:00
Jelena Pjesivac-Grbovic
568477ade8 Adding new Allreduce algorithms, updating allreduce decision function, and cleaning up util.
- Allreduce algorithms:
  - Recursive doubling is used for small messages (up to 10KB) and can be used for 
    both commutative and non-commutative operations.  
	 Recursive doubling passed OCC, IMB-3.2, Intel (Allreduce_c, Allreduce_loc_c, and
	 Allreduce_user_c), mpi_test_suite (Allreduce MIN/MAX, and Allreduce MIN/MAX with 
	 MPI_IN_PLACE) tests on TCP up to 36 nodes and MX up to 64 nodes.
  - Ring algorithms performs well for larger messages but cannot be used for 
    non-commutative operations.  It passed the same tests as recursive doubling, except
	 some of the non-commutative tests in Intel benchmarks Allreduce_loc_c and Allreduce_user_c
	 (which was expected).
- MPI_Allreduce with new decision function passed all of the tests mentioned above.
- Cleaning up coll_tuned_util.  Moving isendrecv to static inline just like sendrecv. 

This commit was SVN r13252.
2007-01-23 01:19:11 +00:00
Jeff Squyres
3389a523e9 Arrgh. That printf should not have been in there!
This commit was SVN r13243.
2007-01-22 18:52:49 +00:00
Jeff Squyres
a24f3c0886 Move the "use eager RDMA" flag to the individual openib BTL modules,
not the component.  This potentially allows for a mix of HCAs that
support eager RDMA and those who do not on a port-by-port basis.

This commit was SVN r13242.
2007-01-22 18:49:32 +00:00
Jeff Squyres
91b855c2f4 Minor fixes in the help messages
* If the text to cite where the problem occurred is "\n", prettyprint
   somethign a little nicer so that it's clear that we're talking
   about the end of line
 * Add a missing help message ("ini file:unknown field"), and display
   it a little better (i.e., show the erroneous field, not a
   misleading "end of line" marker)
 * It's "OpenIB", not "Open IB"

This commit was SVN r13241.
2007-01-22 18:45:43 +00:00
George Bosilca
242292673a sendrecv is a static inline.
This commit was SVN r13237.
2007-01-22 05:50:23 +00:00
George Bosilca
c0b22678a7 OMPI_ENABLE_DEBUG is always defined. Remove all warnings when compiled in
non debug mode.

This commit was SVN r13230.
2007-01-21 18:14:06 +00:00
Rainer Keller
c2e9075d29 - Define a OPAL_CLASS_EMPTY to be used for initialization.
Similar within the dt_module for the predefined datatypes.

This commit was SVN r13228.
2007-01-21 15:52:06 +00:00
Rainer Keller
96030de97b - Initialize the size of the opal_object class.
- Use the OBJ_CLASS_INSTANCE macro to initialize classes.
   This also gets rid of several missing initialization errors.

This commit was SVN r13227.
2007-01-21 14:24:29 +00:00
Jeff Squyres
52ca6cf86c The mpi_leave_pinned and mpi_leave_pinned_pipeline MCA parameters were
needlessly registered in multiple different places, and none of them
had a good help string.  There was also an inconsistent check for
setting both mpi_leave_pinned and mpi_leave_pinned_pipeline (i.e., it
was only in ob1).  This commit moves the registration of these params
to one central place (ompi/runtime/ompi_mpi_params.c, with all other
mpi_* MCA params) and uses globals to propagate the values as
relevant.  The error check was also moved to the central location to
ensure that we can consistency everywhere.

This commit was SVN r13226.
2007-01-21 14:02:06 +00:00
Jeff Squyres
32bfbfc735 Correct a filename that would prevent show_help messages from
appearing properly.

This commit was SVN r13225.
2007-01-21 13:56:16 +00:00
Rainer Keller
3f397f3848 - Do not convert the cart_comm -- its intent is out.
This commit was SVN r13224.
2007-01-21 13:00:23 +00:00