1
1
Граф коммитов

970 Коммитов

Автор SHA1 Сообщение Дата
George Bosilca
8119c970db Improve the connection algorithm for MX. There are 2 problems here:
- first we setup the connections in the begining with all the peers
- MX does not handle well the case where several peers make connections to the same
  destination simultaneously.

So I change the order in which we connect. First we compute our rank in the array,
then in a round-robin fashion we setup connection starting with our left neighboard.

This commit was SVN r8075.
2005-11-10 01:15:49 +00:00
George Bosilca
dc1ad885d1 Move the output message outside the loop. We print an error message only once when we fail to
connect to a peer. Bonus, we print some additional informations like its MAC Address or name
if it's on our tables.

This commit was SVN r8074.
2005-11-10 01:13:18 +00:00
Tim Woodall
4c7c277b0a improve the scalability of MPI_Waitall ...
note that any code that sets a request to a completed state must
now increment a counter for every completed request

This commit was SVN r8073.
2005-11-10 00:45:27 +00:00
Tim Woodall
62fd74140b decrease socket buffers sizes to same as ptl code
This commit was SVN r8072.
2005-11-10 00:40:55 +00:00
Tim Woodall
2f6d50e0c6 init rdma count
This commit was SVN r8071.
2005-11-10 00:04:25 +00:00
Tim Woodall
b5ed723ea4 - check for null return
- disable debug

This commit was SVN r8070.
2005-11-10 00:02:18 +00:00
George Bosilca
55051b81c4 Activate the protection against unavailable datatypes. They get a flag DT_FLAG_UNAVAILABLE. We check now this flag in all the send/recv operations via the macros on mpi/c/bindings.h.
This flag is inherited by all datatypes create with unavailable datatypes. Basically, we let the user create the wrong datatype but we dont let him using it for any pt2pt communications or any pack/unpack.

This commit was SVN r8069.
2005-11-09 23:43:41 +00:00
Tim Woodall
78c98386d7 should reset the count (for persistent requests)
This commit was SVN r8064.
2005-11-09 22:02:48 +00:00
Tim Woodall
58b46d2da0 return mpool resources when request completes rather than in free
This commit was SVN r8063.
2005-11-09 21:59:01 +00:00
Graham Fagg
6b99301893 extra verbose in debug mode to help occ
This commit was SVN r8061.
2005-11-09 21:01:35 +00:00
Edgar Gabriel
b3d3552900 Fix for a problem Brian pointed out with cartesian communicators: in
comm_fill_rest there is no need for calling ompi_set_group_rank, since
we know already the rank of the process in the new comm. In case the
process was not part of the new communicator (rank = MPI_UNDEFINED)
calling this function caused a segfault on some platforms.

This commit was SVN r8060.
2005-11-09 21:00:58 +00:00
George Bosilca
a6fdc2b2b4 Turn off the missing data-type message on MPI_Init.
This commit was SVN r8056.
2005-11-09 17:34:44 +00:00
Galen Shipman
3079fc2da1 use correct lock for threaded build..
This commit was SVN r8055.
2005-11-09 16:09:05 +00:00
George Bosilca
de0676a3dd Do not do any local copy into the storage if this convertor is finished. This is usefull in the Pack/Unpack case when there is more data in the packed buffer than the one we try to extract.
This commit was SVN r8054.
2005-11-09 07:46:12 +00:00
George Bosilca
025a8a04c5 More optimization of the data-type description are now possibles. Some corner cases are corrected. As a result we discover more accurately the contiguous part of the data memory layout.
This commit was SVN r8051.
2005-11-09 00:02:39 +00:00
Tim Woodall
78522ed454 send credits on correct qp
This commit was SVN r8050.
2005-11-08 22:59:44 +00:00
George Bosilca
63ba3bde11 Allow the convertor to remember the last trucated unpack. If the same convertor is used for the next unpack it will put the data back correctly. However, if the BTL/PTL create a new convertor, even if it clone the last one this magic will not happens !
This commit was SVN r8048.
2005-11-08 21:48:48 +00:00
George Bosilca
cdfe5e71fd By default there is no pending length on the convertor.
This commit was SVN r8047.
2005-11-08 21:45:45 +00:00
Tim Woodall
b4ca28da4b removed debug
This commit was SVN r8046.
2005-11-08 21:41:02 +00:00
Jeff Squyres
6de5c208f2 Fix propblem with prototypes for wtick and wtime in prototypes_pmpi.h.
This commit was SVN r8043.
2005-11-08 19:45:51 +00:00
George Bosilca
7582ae3ef1 A simpler way to get output about the packing/unpacking. Now there are 2 MCA parameters datatype_pack_debug and datatype_unpack_debug. When they are set to 1 the ddt engine will dump a lot of messages. Dont turn them to one by default. But if you notice any problems in the ddt you can turn them to one and send me the output.
First step toward adding memory to the convertor. It will be able to keep partial basic datatype between calls ...

This commit was SVN r8042.
2005-11-08 17:49:51 +00:00
George Bosilca
2b9b5500b9 Change some variable's names.
This commit was SVN r8041.
2005-11-08 17:44:56 +00:00
George Bosilca
c63e4dcef9 When we finish one of the loops take care of the index of the begining of the loop. If it's -1 then we just complete the full datatype ... therefore we have to do something special.
This commit was SVN r8040.
2005-11-08 16:53:31 +00:00
Tim Woodall
2d9c509add flow control
This commit was SVN r8039.
2005-11-08 16:50:07 +00:00
Graham Fagg
bcf8744bf6 valgrind saved me from a nasty order of eval error... i.e. derefing slected_data before setting it.
Anyway fixed and no memory leaks in coll tuned so far.

This commit was SVN r8037.
2005-11-08 04:52:30 +00:00
Graham Fagg
5b3ba944a8 Enabled, and running...
todos. turn the debug messages  into ompi ignorables and inot do some ops in ompi_bug mode

This commit was SVN r8036.
2005-11-08 04:43:17 +00:00
Graham Fagg
833b558046 Full configuration file based control of tuned collectives.
(verbose on bad config file and even cleans up after itself enought to make valgrind happy).

This commit was SVN r8035.
2005-11-08 03:36:38 +00:00
George Bosilca
579398a135 Change some variable names (from pSrc to something more clear like user_memory and/or packed_buffer).
This commit was SVN r8034.
2005-11-08 03:12:58 +00:00
Graham Fagg
39207db7cd removed the n-dimmension rule base..
replacing it was simpler code for V1

This commit was SVN r8033.
2005-11-08 03:03:51 +00:00
George Bosilca
4ed2da50e9 A step forward. The original displacement for contiguous data with gaps is now correctly computed. At least the original displacement.
This commit was SVN r8031.
2005-11-08 00:03:05 +00:00
George Bosilca
387390355c Shame on me ... there should be extent not displacement.
This commit was SVN r8030.
2005-11-08 00:02:14 +00:00
George Bosilca
ce65ef3c6e And here is the makefile that integrate the new files. Now ... have as much fun as I did :)
This commit was SVN r8029.
2005-11-07 23:25:12 +00:00
George Bosilca
ccbeb6ac5a Take in account the original displacement for contiguous datatypes.
Limit the amount of data to be packed to the remaining on the convertor. This make the things a lot simpler in the pack/unpack functions.

This commit was SVN r8028.
2005-11-07 23:24:13 +00:00
George Bosilca
5641f4f56b Change the name of one of the fields in the end_loop structure.
Update all the macros to reflect the change.
A slightly different version of the boundaries checking function.

This commit was SVN r8027.
2005-11-07 23:22:43 +00:00
George Bosilca
1ddb90bbae Slim fast ... Do as less as possible on the critical path. The most expensive function now is the one that compute the stack when we move to a new position. For this function there are
several versions depending on the type of the data annd the position where we want to go.

This commit was SVN r8026.
2005-11-07 23:21:27 +00:00
George Bosilca
461f607fd3 Add one prototype from the new_position.c
This commit was SVN r8025.
2005-11-07 23:19:54 +00:00
George Bosilca
8df200528d The END_LOOP structure change the name of one of it's fields.
This commit was SVN r8024.
2005-11-07 23:18:57 +00:00
George Bosilca
f7359e24d6 Add some macros in the begining of the file. They are not used right now, but they will be in few days.
Do not ignore the type and extent of the last optimized basic type in some special cases.

Update the last fake END_LOOpP with the correct value for the first_elem_disp field.

This commit was SVN r8023.
2005-11-07 23:17:00 +00:00
George Bosilca
53cb3c2bee Force the data name to the empty string when we call destroy.
This commit was SVN r8022.
2005-11-07 23:14:32 +00:00
George Bosilca
8799d1799a The shiny new pack and unpack functions. The big difference is that the displacement is
never stored on the stack. It is partially stored on the stack depending on the loops
but every time we pack/unpack a basic datatype we take in account again it's displacement.
This approach make the whole logic a lot simpler. In same time I split the big functions
in several basic block.

This commit was SVN r8021.
2005-11-07 23:13:04 +00:00
George Bosilca
334ca349fe Several bug fixes:
- if the alignment of wchar is zero then wchar_t is not supported by the OS. We skip it.
- Now that the definition of end_loop change compute the first_elem_description for all
  predefined datatypes.
- In debug mode print a list of the datatypes that are not supported by the current
   architecture.

This commit was SVN r8020.
2005-11-07 23:10:33 +00:00
George Bosilca
84a89d68dc When we advance the convertor by a multiple of the data size there is a quick optimization.
We can compute the number of complete datatype that we will advance, update the stack and
then compute the new position taking in acount only the remaining bytes.

This commit was SVN r8019.
2005-11-07 23:00:28 +00:00
Jeff Squyres
a1ba3168d9 Remove extrameous comments
This commit was SVN r8017.
2005-11-07 22:44:26 +00:00
George Bosilca
9832d5d883 The OMPI_GENERATE_F77_BINDINGS work only for the most common F77 bindings, the
one that does not return any value. There are 2 exceptions MPI_Wtick and MPI_Wtime.
For these 2 we can insert the bindings manually.

This commit was SVN r8016.
2005-11-07 19:37:32 +00:00
Brian Barrett
28891d6de3 * Move MPI_Wtime and MPI_Wtick back out of mpi.h and into the C bindings library,
restoring the PMPI version.  A variety of reasons for this:

  - mpi.h was blinding using inline in a C header without the configrue mojo
    properly set it, as mpi.h doesn't include ompi_config.h.  This eventually
    would have caused a borked build.
  - mpi.h and mpif.h were never updated to not include PMPI_W{tick,time} as
    a proper prototype
  - The C++ and F90 bindings didn't do the right things when there was no
    PMPI version of the C call, but profiling was enabled
  - Since we only use gettimeofday, the function call overhead really doesn't
    matter

This should probably go to the 1.0 branch

This commit was SVN r8014.
2005-11-07 17:22:48 +00:00
Jeff Squyres
60b19dcf63 Add missing functions for MPI_LONG_LONG, MPI_LONG_LONG_INT, and
MPI_UNSIGNED_LONG_LONG.

This commit was SVN r8010.
2005-11-07 14:42:46 +00:00
Jeff Squyres
21be5e18ee - Fix the MPI_Op intrinsic operation string names ("MPI_foo", not
"MPI_OP_foo")
- Remove all the handlers for MPI_REPLACE for general reductions
  (it's only defined for MPI_ACCUMULATE, and ACCUMULATE is handled
  differently than the other reductions, so it's safe to make all the
  maps for REPLACE be empty)

This commit was SVN r8008.
2005-11-07 13:30:17 +00:00
George Bosilca
288cdaf302 This is the way to compute the position for a convertor under the new rules.
This file is now yet activated. It will became the default after the next
commit.  (checkpoint to start testing on other clusters)

This commit was SVN r8006.
2005-11-07 09:00:52 +00:00
George Bosilca
c1b713c56e Make a compiler happy about casting.
This commit was SVN r8005.
2005-11-07 04:59:46 +00:00
George Bosilca
7b7aaf897c Do not add epsilon to the data extent if there is a user set UB for the data.
This commit was SVN r8004.
2005-11-07 04:04:20 +00:00
Graham Fagg
dcd3450e06 simplified the building of different rule sets
(also corrected some prototypes missing 'struct')

This commit was SVN r8003.
2005-11-06 22:05:50 +00:00
Jeff Squyres
42ec26e640 Update the copyright notices for IU and UTK.
This commit was SVN r7999.
2005-11-05 19:57:48 +00:00
Jeff Squyres
2e10d0c099 Forgot to add the intrinsic op MPI_REPLACE (Brian, the One Sided Bug
Finder, gets credit)

This commit was SVN r7993.
2005-11-04 19:00:35 +00:00
Tim Woodall
e45f4744ee do not return these descriptors to cache
This commit was SVN r7986.
2005-11-03 23:20:38 +00:00
Tim Woodall
26003bc952 fix from release branch - don't use get protocol if more
than one btl is available

This commit was SVN r7984.
2005-11-03 20:52:56 +00:00
Jeff Squyres
9e723b8519 Remove some compiler warnings
This commit was SVN r7973.
2005-11-03 15:26:43 +00:00
Jeff Squyres
4fc135fd2b Looks like I forgot to put DDT support for the optional C datatypes
MPI_UNSIGNED_LONG_LONG, MPI_LONG_LONG, and MPI_LONG_LONG_INT --
although I already had implementations of all the relevant functions
for these types.  Doh!

This commit was SVN r7944.
2005-11-01 03:28:59 +00:00
Graham Fagg
9547a635a9 snapshot while switching systems
but, dynamic rules from a user defined config file is almost there now

This commit was SVN r7943.
2005-11-01 00:19:05 +00:00
Graham Fagg
fe03e068f2 allow forced algorithms (where the user or *test* suite knows better) to
go through the dynamic decision rule interface.
(forced algorithms are set with MCA params)
fixed some silly verbose output with wrong func name in it etc
updates to fixed dec rules.

This commit was SVN r7940.
2005-10-31 20:45:50 +00:00
Tim Woodall
31eb35c3f1 correct rnr parameter - need to review this code and pass correct data type
This commit was SVN r7936.
2005-10-31 17:18:39 +00:00
Edgar Gabriel
2ec5fa5d24 - The component will remove itself from the list of potential collective
modules, if its priority is zero (the default value). Reason for that is
  + if there is no other module with a priority > 0, the hierarchical
    collective module has a problem anyway, since it has to rely on the coll
    modules of the subcommunicators. On the other hand, if its priority is
    zero, it won't be chosen anyway, and we can simply save the
    allreduce/allgather and comm_split operations which might occur during
    hierarchy detection.
  + to improve the startup times until we have the modex thing which we
    discussed with Jeff and Tim in Knoxville in place

- adding an mca parameter indicating a symmetric configuration. This can 
  speed up startup times, since each process can conclude from its data onto
  the data of the other processes -> no need for the allreduce operations. Per
  default  this parameter is set to "no".

This commit was SVN r7932.
2005-10-30 16:01:13 +00:00
George Bosilca
b0def3f6bf MX has 2 limitations regarding the iovecs. First they do not support iovec witha total size
larger than 32K for inter-nodes transfert ... and then they do not support iovecs larger than
16K for inter-node transfert. Therefore we have to set the size of our first fragment to
16K to match both cases.

This commit was SVN r7926.
2005-10-28 20:37:43 +00:00
Jeff Squyres
7bdfe6557b - Update the checks in REDUCE, ALLREDUCE, SCAN, EXSCAN, and
REDUCE_SCATTER to more thoroughly check the datatype/op combination
  to see if it's valid or not.  If it's not, print a meaningful error
  message rather than "Invalid MPI_Op" indicating what specifically
  was wrong (therefore hopefully helping users track down where in the
  code the problem is, and/or telling us that there's a reduction
  operation combo that we don't support that we should)
- The check for whether a datatype is intrinsic needed to be updated
  -- it's not sufficient to check that dtype->id < DT_MAX_PREDEFINED;
  you really need to check the PREDEFINED flag on the datatype.
  Thanks to George for this fix (only intrinsics have a meaningful
  value in dtype->id).

This commit was SVN r7923.
2005-10-28 16:47:32 +00:00
George Bosilca
ab97bde177 Rainer pointer out that the convertor already have the CONTIGUOUS flag is the
data is contiguous (set in ompi_convertor_prepare).
For unpack reinforce the limits of the pack for contiguous types.

This commit was SVN r7914.
2005-10-28 05:27:40 +00:00
George Bosilca
5355765d81 Cleanup has to reset the stack position.
This commit was SVN r7913.
2005-10-28 05:25:08 +00:00
George Bosilca
d916e0c5b4 The (I hope) final solution for the convertor problem. As all the PML inherit
the base send and receive request from the pml_base, we can solve our problem
if we construct the convertor attached to any request in the pml_base_construct
function. At the end of the life time for each request (here life time is 
related to one utilisation, without taking in account the cache) we release
all information attached to the convertors in the _FINI macro by calling the
ompi_convertor_cleanup.

This commit was SVN r7910.
2005-10-28 03:26:36 +00:00
Brian Barrett
bf67c9387b * initialize send request convertor with the correct type (convertor instead
of request).  This fixes at least the bug with NetPIPE in 64bit land that
  Troy was seeing. 

This commit was SVN r7904.
2005-10-27 23:08:27 +00:00
Galen Shipman
4a15761732 add support for srq limit reached async event, even though it doesn't appear
to  be supported by mellanox vapi.. perhaps this will be supported in the near 
future, for now it doesn't hurt to have it in the trunk


Also cleanup the receive descriptor posting macro's.. 

This commit was SVN r7903.
2005-10-27 22:47:19 +00:00
Tim Woodall
3bd5b81dfa Submitted: Gleb Natapov
This commit was SVN r7899.
2005-10-27 17:48:40 +00:00
Tim Woodall
4fc5b2105a this is currently an int - we shouldn't restrict it unless required
This commit was SVN r7895.
2005-10-27 17:06:58 +00:00
Tim Woodall
13409ec53b correction for hang, check for additional fragments before callback,
which may queue a new fragment

This commit was SVN r7889.
2005-10-27 01:39:39 +00:00
Graham Fagg
5bb0d4a053 enable allreduce to be selected
This commit was SVN r7888.
2005-10-26 23:55:37 +00:00
Graham Fagg
2587d7ade9 added some more linear functions.
minor corrections on naming and debug info

This commit was SVN r7887.
2005-10-26 23:51:56 +00:00
Graham Fagg
c3e1dc410d Started to add basic linear functions
Also started to add the allreduce algorithms as I test them
(i.e. if it goes in its after testing from now on)

This commit was SVN r7886.
2005-10-26 23:11:32 +00:00
Jeff Squyres
d47ce065e9 Minor Makefile.am fix for static builds.
This commit was SVN r7882.
2005-10-26 15:57:58 +00:00
Edgar Gabriel
ba3bf6592f fixing some warnings. No idea yet why the static builds fail...
This commit was SVN r7879.
2005-10-26 12:56:56 +00:00
Jeff Squyres
23ab9e0277 A better solution to the previous commit -- RETAIN/RELEASE the MPI_Op
at the top-level MPI API function.  This allows two kinds of
scenarios:

1. MPI_Ireduce(..., op, ...);
   MPI_Op_free(op);
   MPI_Wait(...);

For the non-blocking collectives that we're someday planning -- to
make them analogous to non-blocking point-to-point stuff.

2. Thread 1:
   MPI_Reduce(..., op, ...);
   Thread 2:
   MPI_Op_free(op);

Granted, for #2 to occur would tread a fine line between a correct and
erroneous MPI program, but it is possible (as long as the Op_free was
*after* MPI_reduce() had started to execute).  It's more realistic
with case #1, where the Op_free() could be executed in the same thread
or a different thread.

This commit was SVN r7870.
2005-10-25 19:20:42 +00:00
Edgar Gabriel
d009d8de57 opening the hierarchical collective component to the public. I am at this
stage fairly confident that 
- it works in most scenarious (with symmetric hierarchies, with asymmetric
  hierarchies, wihout hierarchies - it just removes itself)
- it does not create too many problems (I am not aware of any at least)
- it does not slow down startup anymore dramatically (thanks to the fixes of
  Brian, Jeff, Tim and a significant reduction in the number of collective
  operations in the comm_query)
Any feedback is highly welcome.

This commit was SVN r7868.
2005-10-25 18:38:43 +00:00
Edgar Gabriel
00c04ab56a moving the hierarch collective component to the new parameter
registration interface.

This commit was SVN r7867.
2005-10-25 18:34:47 +00:00
Edgar Gabriel
3633605010 moving op_init further up in ompi_mpi_init, since it is required when
quering some of the collective components. Up to now, it just worked
somehow, but now with correct reference counting for ops in place, it
refused :-)

This commit was SVN r7866.
2005-10-25 18:33:48 +00:00
Jeff Squyres
ef09e768e0 Ensure to OBJ_DESTRUCT to free memory during finalize (caught by
Brian). 

This commit was SVN r7864.
2005-10-25 17:27:58 +00:00
Jeff Squyres
f8fd10715c - Minor style fix
- Be sure to properly OBJ_CONSTRUCT the intrinsic MPI_Op's
- RETAIN/RELEASE the op's when used in the invoke function

This commit was SVN r7863.
2005-10-25 16:24:00 +00:00
Jeff Squyres
a9f04c7573 Only do the extra va_* stuff if we're compiling with the compiler that
cares about it (PGI).

This commit was SVN r7860.
2005-10-25 13:08:52 +00:00
Graham Fagg
382f05c7ad Infastructure changes.
started to add static (fixed if) statement based decision rules based on gigE numbers
added mca params so that a user can force a certain algorithm/segment/topo on a per collective basis
(this is not in the fixed call path but only in the dynamic (at com create) call path).
(these params can be used by test suites such as OCC to choice which algorithm they are using).

This commit was SVN r7854.
2005-10-25 03:55:58 +00:00
Graham Fagg
d8e32464cb ops. setting/reading mca option from the right varible would help.
This commit was SVN r7850.
2005-10-24 21:33:48 +00:00
Brian Barrett
1e2f7d6a3d * make sure to expose ompi_op_t as an object
This commit was SVN r7848.
2005-10-24 20:31:14 +00:00
Rainer Keller
d6120d32d6 - Only minor white-space changes, to clean up
This commit was SVN r7843.
2005-10-24 10:36:16 +00:00
George Bosilca
b45651988b Protect against elements with ZERO length.
Remove all the useless code.

This commit was SVN r7827.
2005-10-21 06:48:51 +00:00
George Bosilca
1fb8ec646a Add the homogeneous flag back in the convertor.
Correct/improve one of the comments.
Descrease the amount of memory required for the stack.

This commit was SVN r7826.
2005-10-21 06:47:57 +00:00
Galen Shipman
cb84a57c57 add endpoint and srq flow-control..
Note, we are failing the ring tests in the intel p2p test suite, but we seem
to fail the same tests under the current trunk.. will look into this further. 

This commit was SVN r7823.
2005-10-21 02:21:45 +00:00
Galen Shipman
c4889ac759 update openib mpool to properly deregister and release (carry over from
mvapi). 

Still need to add endpoint and srq flow control as in mvapi 

This commit was SVN r7816.
2005-10-20 03:57:17 +00:00
Galen Shipman
0d1d231169 convert to new mca params, adding description strings.
changed mca param rr_buf_min/max to rd_min/max 
Add bandwidth param to openib 

This commit was SVN r7815.
2005-10-20 02:55:21 +00:00
George Bosilca
75bc3dd43c Dont mess around with the OBJ_DESTRUCT on the communicator. It's quicker (and safer) to call
directly the communicator cleanup function (ompi_convertor_cleanup).

This commit was SVN r7814.
2005-10-19 21:28:52 +00:00
George Bosilca
1d75b7972f Solve thee problem with the reference count on the datatype (RT bug 1492). The problem is that the
convertor (when prepared) increase the reference count on the used datatype. This reference count
will be released only when the OBJ_DESTRUCT is called on a convertor. However, having to call
OBJ_CONSTRUCT and OBJ_DESTRUCT on each request every time we want to use it (even when it come
from the cache) is an expensive operation. This can be avoided is the OBJ_DESTRUCT will leave the
convertor in exactly the same state as OBJ_CONSTRUCT. With this approach we just have to call
OBJ_CONSTRUCT for each convertor once when we initially create the request.

This commit was SVN r7813.
2005-10-19 20:57:39 +00:00
George Bosilca
63c5013fe6 After a OBJ_DESTRUCT a convertor has to be in a usable state. Read the comment for more informations.
This commit was SVN r7812.
2005-10-19 20:51:52 +00:00
George Bosilca
8987bcabe2 Remove the memcpy we can do it as we parse the datatypes in order to increase their references.
This commit was SVN r7811.
2005-10-19 20:51:11 +00:00
George Bosilca
6c6f17628f Remove a double OMPI_DECLSPEC from the definition of one of the predefined data-types.
This commit was SVN r7810.
2005-10-19 20:50:25 +00:00
Brian Barrett
de5e501519 Rather than hard spinning waiting for something to happen when doing shared
memory initialization, call opal_progress() to push any pending events
around and possibly yield the processor if nothing entertaining is happening.

This should probably go to the 1.0 branch.

This commit was SVN r7808.
2005-10-19 00:56:14 +00:00
George Bosilca
d2f831cd18 Construct the convertor attached to the receive request. This should happens only on the first allocation of a request object.
This commit was SVN r7807.
2005-10-18 21:53:05 +00:00
Brian Barrett
bcebd1b6b7 Fix a couple of places where headers didn't get installed correctly when
--with-devel-headers is given to configure:
  * allocator, rcache, and mpool were putting things in the wrong place
  * timer wasn't installing the inline implementations at all

This commit was SVN r7805.
2005-10-18 20:12:55 +00:00
Edgar Gabriel
3a7efaf4d9 fix for reduce and allreduce for an unsymmetric case
This commit was SVN r7802.
2005-10-18 19:20:48 +00:00
Edgar Gabriel
818b4af554 - reverting the logic in the hierarchy detection stuff. This can reduce the
number of collective operations and simplifies the logic significantly.
- introducing a special case if size of comm == 1, avoiding thus collective
 operations as well ( i.e. no need for hierarchies)
- fix for an unsymmetric case. Still to be tested.

This commit was SVN r7799.
2005-10-18 18:17:50 +00:00
Tim Woodall
b570c8cad4 need to specify a size, base address will match
This commit was SVN r7798.
2005-10-18 17:01:36 +00:00
Galen Shipman
4d2d39b0a6 intial checking of SRQ flow control support for mvapi
This commit was SVN r7796.
2005-10-18 14:55:11 +00:00
Jeff Squyres
f9974f72e0 construct/destruct convertor when requests are
constructed and allocated to free lists

This commit was SVN r7791.
2005-10-18 12:19:43 +00:00
Jeff Squyres
a459659a33 Print the string name of the return code
This commit was SVN r7789.
2005-10-17 20:47:44 +00:00
Galen Shipman
3efecaaeda convert openib btl to use new mca_param registration.. Also, change rr_buf_min
and rr_buf_max to rd_min and rd_max 

This commit was SVN r7786.
2005-10-17 20:00:34 +00:00
Tim Woodall
c944988b9e merge in changes from release branch - acquire/release send token for put/get
This commit was SVN r7784.
2005-10-17 18:59:28 +00:00
Jeff Squyres
89931ac05f - Correct typo in comment
- Add DIST_SUBDIRS to ompi/tools/Makefile.am

This commit was SVN r7780.
2005-10-17 11:55:55 +00:00
Brian Barrett
1302cb4072 The next in a long line of crazed build system changes from Brian. This was
originally suggested by Ralf Wildenhues, to try to speed autogen, configure,
and make (and possibly even make install).  Use automake's include directive
to drastically reduce the number of Makefile files (although the number of
Makefile.am files is the same - most are just included in a top-level
Makefile.am).  Also use an Automake SUBDIRs feature to eliminate the
dynamic-mca tree, which was no longer really needed.  This makes adding
a framework easier (since you don't have to remember the dynamic-mca
tree) and makes building faster (as make doesn't have to recurse through
the dynamic-mca tree)

This commit was SVN r7777.
2005-10-17 00:21:10 +00:00
George Bosilca
6e3c23ec3b Do not allow the use of the optimized path for predefined non contiguous datatypes (like MPI_SHORT_INT on most of the architectures).
This commit was SVN r7776.
2005-10-16 19:41:40 +00:00
Edgar Gabriel
7e45f64065 reduce has now been tested quite extensively for all (predefined) operations
and for all root nodes and passed all tests.
First cut on barrier (which from my perspective does not make sense from the
performance point of view) and on allreduce (which might make sense),

This commit was SVN r7774.
2005-10-15 22:24:44 +00:00
Edgar Gabriel
3fab9c628c switching the root and creating (if necessary) the new local leader sub-communicators seems to work as well. Thoroughly tested with bcast, not yet that exhaustivly tested for the reduction.
This commit was SVN r7773.
2005-10-15 21:13:44 +00:00
Edgar Gabriel
7d34770456 further bugfixes. The hierarchy detection works now as far as I can see (even in unsymmetric sitations). Bcast and reduce work as well. Still to test: the code which generates new local leader communicators, in case the root of the operation is not yet part of the lleader comm.
This commit was SVN r7772.
2005-10-15 19:36:54 +00:00
Edgar Gabriel
63554d245f further bugfixes
This commit was SVN r7771.
2005-10-15 18:44:57 +00:00
Edgar Gabriel
92c7b77cbc minor bug fixes
This commit was SVN r7770.
2005-10-15 18:32:40 +00:00
Edgar Gabriel
ba163c611c checkpoint before moving to a real cluster. Most of the recoding should be
done. This version also doesn't break ompi (at least if its not chosen :-) ).
New features compared to the version from last Thursday (where bcast and
reduce seemed to work in most scenarios):
- clearer internal infrastructure
- ability to handle all root processes with a (hopefully) minimal number of
local leader communicators. 

This commit was SVN r7769.
2005-10-15 17:04:01 +00:00
Jeff Squyres
e097ee635a Silence compiler warnings.
This commit was SVN r7768.
2005-10-14 22:06:25 +00:00
Jeff Squyres
237bd4c6cd Fix ompi_info -- cxx:bindings was somehow hard-coded to "yes" instead
of reflecting whether the C++ bindings were supported or not.

This commit was SVN r7766.
2005-10-14 20:07:05 +00:00
Jeff Squyres
f47c272986 Fix for the max-31-F90-symbol-limit problem: keep the interface names
the same (since those are both mandated by MPI and <31 characters),
but change some of the back-end subroutine names so that they are <31
characters and therefore obey the F90 standard.  Remove an outdated /
useless (and confusing) script.

This commit was SVN r7764.
2005-10-14 19:50:30 +00:00
Edgar Gabriel
2c909383bb abstracting the group_free operation into an internal routine (required
by some other components on ompi).

This commit was SVN r7763.
2005-10-14 18:51:20 +00:00
Edgar Gabriel
84c070fc0f get rid of the different modes how to store the colorarray for now. Might be
reintroduced later as an optimization.

This commit was SVN r7762.
2005-10-14 18:11:21 +00:00
Edgar Gabriel
6d14440972 checkpoint for moving again to another machine. major rewrite to clean
up internal interfaces in progress.

This commit was SVN r7761.
2005-10-14 17:41:44 +00:00
Edgar Gabriel
770aeaf97b modifications towards adding new local-leader communicators.
This commit was SVN r7760.
2005-10-14 12:18:29 +00:00
Graham Fagg
636b42afff handle non existant recv buf in reduce for non root processes
(basic allreduce does this for mpi_in_place case)

This commit was SVN r7759.
2005-10-14 00:00:37 +00:00
Graham Fagg
61b8218d76 MPI_IN_PLACE fix for reduce.
(actually a work around for an optimisation in the reduce for not saving ops on the first recv of each segment)
Minor change in topo.

This commit was SVN r7758.
2005-10-13 23:38:21 +00:00
Edgar Gabriel
48f2563b4c checkpoint. Moving to another machine.
This commit was SVN r7757.
2005-10-13 20:04:26 +00:00
Edgar Gabriel
4b05359b16 minor fixes when freeing the component
This commit was SVN r7756.
2005-10-13 18:22:16 +00:00
Edgar Gabriel
0a5a346bbb first cut on the reduce operation.
This commit was SVN r7755.
2005-10-13 17:58:13 +00:00
Edgar Gabriel
30af775d40 further fixes. The first hierarchical MPI_Bcast works! Its just ~ 100 times slower then basic at the moment :-)
This commit was SVN r7754.
2005-10-13 17:34:42 +00:00
Edgar Gabriel
460b5cb840 further corrections to the hierarchy detection algorithms. It seems to work now as far as my tests show...
This commit was SVN r7753.
2005-10-13 16:21:13 +00:00
Edgar Gabriel
f5d16419b2 fix in the logic regarding protocol detection.
This commit was SVN r7749.
2005-10-13 15:07:35 +00:00
Edgar Gabriel
5d7fbd9d2e minor change in bml_r2_add_procs: the memory for the bml_endpoints structure
has to be allocated outside of the routine. Thus, the update version of pml/ob1/oml_ob1.c

This commit was SVN r7739.
2005-10-12 20:59:25 +00:00
Edgar Gabriel
3e5ad3e681 Updates
This commit was SVN r7738.
2005-10-12 20:56:29 +00:00
Tim Woodall
22f460bdc5 merge in changes from release branch
This commit was SVN r7737.
2005-10-12 20:24:43 +00:00
Tim Woodall
6da9561ea8 merge in correction from v1.0
This commit was SVN r7732.
2005-10-12 16:40:52 +00:00
Tim Woodall
d859855dea merge in changes from 1.0
This commit was SVN r7728.
2005-10-12 15:54:35 +00:00
Jeff Squyres
727a2cf8b2 Correct a few #if issues that George identified in a code review
This commit was SVN r7724.
2005-10-12 13:19:46 +00:00
Jeff Squyres
c568936a7c Add support for MPI_OP_SUM, PROD, REPLACE with MPI_DOUBLE_COMPLEX.
Need to consult with George -- might also need to add support for
complex types on floats or long doubles...

This commit was SVN r7716.
2005-10-12 02:31:28 +00:00
George Bosilca
cb7b401ca8 Correct the send-recv operation.
This commit was SVN r7709.
2005-10-11 22:41:08 +00:00
Edgar Gabriel
25518b63c5 first version of coll_hierarch which does not crash the rest of the
library as long as its not selected :-)

This commit was SVN r7707.
2005-10-11 22:05:24 +00:00
Josh Hursey
ef51608a81 fix compiler warning
This commit was SVN r7706.
2005-10-11 22:03:21 +00:00
Edgar Gabriel
0675c22dab updating with Jeff's help to the recent autogen/configure system
This commit was SVN r7705.
2005-10-11 21:50:16 +00:00
Edgar Gabriel
7b07dbc163 another round of fixes. Unfortunatly, I also have to provide a trivial
version of reduce and gather to make all this work....

This commit was SVN r7702.
2005-10-11 21:26:07 +00:00
Tim Woodall
4a71621410 merge in scheduling changes from release branch
This commit was SVN r7699.
2005-10-11 20:41:51 +00:00
Edgar Gabriel
c8adc2e65e coding around the collective operations
This commit was SVN r7698.
2005-10-11 20:34:17 +00:00
Edgar Gabriel
083d0b9630 Checkpoint: most of the coding should be done for the basic
infrastructure.

This commit was SVN r7696.
2005-10-11 19:45:21 +00:00
Graham Fagg
607bdf51b6 Last Cleanup BEFORE adding last two methods and final cross over points.
- new mca param calls
- move printfs to OPAL_OUTPUT

This commit was SVN r7692.
2005-10-11 18:51:03 +00:00
Edgar Gabriel
b42d4ac780 Checkpoint:
- update the hierarch stuff to use btl's instead of ptl's
- start the new logic regarding how to handle local leader communicators

This commit was SVN r7691.
2005-10-11 17:29:59 +00:00
Galen Shipman
23cbac25c8 lower default free list sizes..
This commit was SVN r7676.
2005-10-09 18:15:12 +00:00
Galen Shipman
fb19cc4177 compiler warning fixes..
This commit was SVN r7661.
2005-10-07 17:38:34 +00:00
Jeff Squyres
b22fab2826 Fix for a bug Galen noticed yesterday -- make the shared memory only
be allocated the first time a sm coll is selected for a communicator,
not before.

This commit was SVN r7647.
2005-10-06 13:17:27 +00:00
George Bosilca
1fe18814da Decrease the default length for the first fragment.
This commit was SVN r7643.
2005-10-06 00:05:01 +00:00
George Bosilca
0f04132b13 mx_connect in the MX documentation is supposed to take a timeout in seconds. However, in real life it seems that the timeout should be in micro-second.
This commit was SVN r7642.
2005-10-06 00:04:27 +00:00
Brian Barrett
b7ef094766 * the cid in the header is only 16 bits, so limit our max cid to what can fit in there.
This commit was SVN r7639.
2005-10-05 15:43:28 +00:00
Jeff Squyres
83b5a675f9 Don't automatically take the first entry off the selected component
list; be sure to check its priority against the basic component and
take the one with the higher priority.

This commit was SVN r7621.
2005-10-04 17:09:45 +00:00
George Bosilca
967cd1be32 Make the datatype compile on solaris.
Remove some warnings ...

This commit was SVN r7619.
2005-10-04 15:45:18 +00:00
Jeff Squyres
b17c4334c4 - Remove all vestigates of using the built-in mcb_tree from the
reduce_inorder() function -- we don't use the tree at all.
- Add more relevant "volatile"'s for the control buffers in the
  fragment mpool (and associated casts where necessary)

This commit was SVN r7616.
2005-10-04 14:52:59 +00:00
George Bosilca
9a67831ba3 Alway call the memory allocation function with the correct type for the first argument. The problem is
that on some OSes the iovec struct is not POSIX complian, the iov_len is not a size_t but simply an int.
This patch, add a local variable (type size_t) to use with the memory allocation function, and then put
back the value in the iov_len field.

This commit was SVN r7615.
2005-10-04 14:44:59 +00:00
George Bosilca
059c802094 Correct a small typo
This commit was SVN r7614.
2005-10-04 14:42:37 +00:00
Jeff Squyres
94bab558dd Put in a check to ensure the root is valid (all other rooted
operations have this; we somehow missed this for intracomms on reduce,
and it bit me this morning ;-) )

This commit was SVN r7612.
2005-10-04 14:38:17 +00:00
Tim Woodall
3b4a134a24 - removed unused define
- correct free to release registration rather than retain it

This commit was SVN r7611.
2005-10-04 14:33:26 +00:00
George Bosilca
f8355ec104 Cast the right side member to void* before assignment.
This commit was SVN r7608.
2005-10-04 12:37:23 +00:00
George Bosilca
6b3d02b514 Warning cleanups. On some OSes the iov_base member of the iovec structure is defined as an void * when
on others as an char*. Thus the right side of all assignment should be explicitly casted to an void* in
order to avoid any casting complaints from the compilers.

This commit was SVN r7607.
2005-10-04 12:36:07 +00:00
George Bosilca
3453a6c0e9 Remove some compiler warnings about unused variables
Correctly define the 64 bits constants.
Some minor cleanups.

This commit was SVN r7606.
2005-10-04 12:29:51 +00:00
George Bosilca
492c0e59dc Correct the casting type and remove some useless output (already commented out).
This commit was SVN r7605.
2005-10-04 12:28:47 +00:00
Tim Woodall
c05ef28f6e - added routine to ompi_pointer_array to remove array contents
- corrected memory hook callback to catch all allocations (need to optimize this)
- don't attempt to consolidate allocations

This commit was SVN r7600.
2005-10-03 23:29:26 +00:00
Jeff Squyres
c7fe54ba44 - Remove some silly compiler warnings
- Move the "process 0" logic out of the main loop in reduce to make
  the code a bit less complex (at the price of slight code
  duplication, but it iss now significantly easier to read)
- Fix problem with uniquenes guarantee in the bootstrap mpool -- using
  the CID alone was not sufficient enough to guarantee uniquenes; now
  use (CID, rank 0 process name) tuple to check for uniqueness
- Made a few debugging help changes in coll_sm.h; especially helps
  debugging on uniprocessors

This commit was SVN r7599.
2005-10-03 21:34:58 +00:00
Jeff Squyres
2cedfeec53 - Eliminate some unused base globals
- Move one base global to the basic component and make it an MCA
  parameter 
- Convert the basic component to use the new MCA param API

This commit was SVN r7598.
2005-10-03 21:07:42 +00:00
Jeff Squyres
57fb96b018 Clarification of a help message
This commit was SVN r7597.
2005-10-03 21:06:13 +00:00
Jeff Squyres
ab099fa8cb Re-indent; real commit with some changes coming shortly.
This commit was SVN r7596.
2005-10-03 19:56:39 +00:00
Galen Shipman
eefe0fd04a fix threaded compile
fix misc warnings 
cleanup posting of receive descriptors 
comment why we retain before deregister in rcache_rb_mru.c 

This commit was SVN r7595.
2005-10-03 16:35:12 +00:00
Galen Shipman
f46548e691 Add SRQ support to OpenIB btl, removed old mca param - not used..
This commit was SVN r7585.
2005-10-02 18:58:57 +00:00
Jeff Squyres
10064df0e9 Remove compiler warning
This commit was SVN r7578.
2005-10-02 10:43:53 +00:00
Jeff Squyres
84feccd3d5 This is something I forgot to commit from long ago -- already
discussed and cleared with Edgar.

Ensure that only processes who will be in the new communicator call
the coll selection function.  It is pointless (and Bad in some cases)
for processes who are not in the new communicator to try to select a
coll module for the new communicator.

This commit was SVN r7573.
2005-10-01 11:57:17 +00:00
Jeff Squyres
37fc944b01 Use the right number of segments per in-use flag when calculating
offsets.

This commit was SVN r7571.
2005-09-30 23:12:23 +00:00
Galen Shipman
67d38b7896 Add multi-nic support to openib
Fix connection establishment race in openib 
Other misc 

This commit was SVN r7570.
2005-09-30 22:58:09 +00:00
Jeff Squyres
934caaf449 Fix at least one segv; use the right number of segments (i.e., the
number o segments in the fragment pool, not in the bootstrap pool)

This commit was SVN r7565.
2005-09-30 18:01:15 +00:00
Brian Barrett
db872a0fbb * check that return from ibv_get_devices isn't NULL before calling dlist_start().
On thor, if IB is down, we get NULL back from ibv_get_devices(), which then
  caused segfaults in dlist_start().
* Pretty-print error message if no HCAs found

This commit was SVN r7557.
2005-09-30 14:58:59 +00:00
Jeff Squyres
fcef1774d5 Per advice from Ralf W., change the pkgdata declarations in
Makefile.am's to be a *slightly* more correct (and, more importantly,
less error-prone) construct.

This commit was SVN r7554.
2005-09-30 13:32:39 +00:00
Jeff Squyres
80b7deb4d7 Add in EXTRA_DIST to get helpfile in tarballs
This commit was SVN r7553.
2005-09-30 10:25:04 +00:00
Brian Barrett
7b20370306 * pretty-print an error message if a btl component loads but can't find
any NICs to use
* Make mvapi, gm, and mx components all publish information, even if there
  are no NICs available so that modex_recv doesn't hang.  If there are no
  NICs available, don't set the reachable bit, but don't do anything
  to fail.  This unfortunately doesn't cover the hangs that will result if
  different procs load different sets of components, but it's a start

This commit was SVN r7550.
2005-09-30 04:39:44 +00:00
Brian Barrett
a77c908496 * the last of the tuning params for portals
This commit was SVN r7548.
2005-09-30 04:05:31 +00:00
Galen Shipman
8239e635b9 fix misc warnings, cleanup macro..
This commit was SVN r7547.
2005-09-30 03:13:51 +00:00
Galen Shipman
05e6e51fec re-reg from min of bases and max of bounds
add byte counting for total registered memory 

This commit was SVN r7546.
2005-09-29 21:28:54 +00:00
Jeff Squyres
bc181d7130 Remove the .ompi_ignore so that everyone starts compiling this, but
lower the default priority to 0 so that it's not active unless you
specifically ask for it (this component needs more testing by people
other than me before we unleash it on the public).

This commit was SVN r7545.
2005-09-29 18:05:47 +00:00
Jeff Squyres
d4b7618db7 Comment out what seems to be a debugging output. Will confirm with
George.

This commit was SVN r7544.
2005-09-29 16:39:27 +00:00
Brian Barrett
997644af31 * There are now two forms of ibv_create_cq, one with 3 params and one with 5.
Try to detect which form this version of Open IB uses, defaulting to the 5
  version if we can't figure it out (the new version has 5 params)
* Only add -lcm if it exists on the system - some versions of Open IB
  apparently don't need it.

This commit was SVN r7542.
2005-09-29 13:35:57 +00:00
Josh Hursey
e825b4522f Upon further investigation the fix in r7537 was an anomoly of zero'ing out the
bits to expose the low bits being set. We were casting from a size_t to a void*
which is not good when working with big endian machines.

This fix makes MPI 2 dynamics work on PPC 64 (tested with a Linux OS).

This commit was SVN r7538.

The following SVN revision numbers were found above:
  r7537 --> open-mpi/ompi@fd45714c03
2005-09-28 23:50:42 +00:00
Josh Hursey
fd45714c03 For some reason we have to initialize this variable or bad things happen in the
comm->c_coll.coll_bcast of the rnamebuflen.

This fixes the threaded MPI 2 Dynamics stuff. Should be working great now! Yay!

This commit was SVN r7537.
2005-09-28 22:30:41 +00:00
Galen Shipman
3ded88a3c0 use addr +size -1 instead of base->addr as base->addr is down_aligned.
This commit was SVN r7536.
2005-09-28 20:19:33 +00:00
Galen Shipman
26a74d42fa release, not retain on gm_free
This commit was SVN r7535.
2005-09-28 20:18:52 +00:00
Edgar Gabriel
67dd52efb1 making the allreduce and reduce_scatter tests pass as well
This commit was SVN r7532.
2005-09-28 15:12:05 +00:00
Josh Hursey
75419313f7 check the return code and do something reasonable, instead of progressing and hanging on error
This commit was SVN r7531.
2005-09-28 06:13:51 +00:00
Galen Shipman
c1f5543f62 need to call mpool_release on all registrations obtained in the pml.
sanity checks 

This commit was SVN r7530.
2005-09-28 04:49:40 +00:00
Galen Shipman
b9b78f8f5d modify rcache_rb to find registrations in the middle of a base and bound
This commit was SVN r7528.
2005-09-28 02:11:35 +00:00
Edgar Gabriel
dbbbd416df fixing MPI_IN_PLACE for the log-reduce algorithm.
This commit was SVN r7526.
2005-09-27 21:51:55 +00:00
Galen Shipman
0fc17cedee change order of ops on register
This commit was SVN r7525.
2005-09-27 21:43:41 +00:00
Jeff Squyres
285ded5655 - Ensure to have !initialized || finalized test *first*
- If we have an NS error, don't return an error -- this function's
  purpose is to abort :-)
- s/abort()/exit(1)/ so that we don't drop massive corefiles

This commit was SVN r7524.
2005-09-27 20:26:38 +00:00
Galen Shipman
09e67ce4fd fix off by one on up_align_addr, use base and bound instead of base_align and
bound_align.. 

This commit was SVN r7521.
2005-09-27 18:10:44 +00:00
Galen Shipman
af04b3e1ab fix warnings..
This commit was SVN r7515.
2005-09-27 14:23:51 +00:00
Brian Barrett
80ac5c2efd * there are now two upcoming points where we want to release a version with
a random string of characters as part of the version number (the really
  soon to happen 1.0lanl release and the 1.1sc2005 release that we've
  talked about).  So rather than having alpha and beta fields that must
  be numeric values, have a general field that can be any alphanumeric
  value.

This commit was SVN r7511.
2005-09-27 02:06:05 +00:00
Galen Shipman
3c97b3f722 Modified the registration to include a base_align and bound_align for
searching the tree. Modified the memory callback to search the tree at each
page boundary for registrations. This is necessary as an application may
malloc memory and send out of any portion of that memory, even discontiguous
regions. 

This commit was SVN r7510.
2005-09-27 02:01:21 +00:00
Brian Barrett
d9e80d8f2a * increase size of event queue for receives - it was too small to be useful
on a reasonably sized machine
* if no mpool exists, don't try to malloc out an array of 0 bytes

This commit was SVN r7507.
2005-09-25 17:04:03 +00:00
Galen Shipman
384c472c94 reset ompi_pointer_array in mca_rcache_rb_find otherwise you might use an old
registration by accident.. 

This commit was SVN r7506.
2005-09-24 20:48:14 +00:00
Galen Shipman
02ce7a176e had that backwards..
This commit was SVN r7504.
2005-09-24 16:58:07 +00:00
Galen Shipman
d1246be47e should be strictly > or <
This commit was SVN r7503.
2005-09-24 16:45:34 +00:00
Galen Shipman
c53d51778a fix for warnings..
This commit was SVN r7501.
2005-09-24 15:08:28 +00:00
Galen Shipman
9fe5844071 decrement ref count on removal of registration from mru and tree.
add misc asserts to check for proper reference counting. 

ugly hack 1 -- use mallopt to never release memory ala sbrk - this is
commented out in mca_btl_mvapi_component_init

ugly hack 2 -- test registrations comming out of the tree via rcache_find, for
an unknown reason the tree is returning registrations where the address is not
within the base or bound of the registration. If this happens, we return
NULL. 

comment out code to enable mem hooks if leave_pinned is set, note we can do
this via an mca param and will default it to leave_pinned with mem_hooks when
we iron out these issues. 

I am adding a unit test for the rcache. Note that we have a unit test for the
rb tree but the compare function is significantly different than that used for
registrations. After we have tracked down the issues with rcache_rb we will
remove the above hacks. 

This commit was SVN r7499.
2005-09-24 00:24:49 +00:00
Brian Barrett
50dc5499b4 * fix some remaining --with-btl-portals configure issues
This commit was SVN r7498.
2005-09-24 00:11:40 +00:00
Brian Barrett
0d68728b94 * add some more debugging output for send fragment issue to figure out why
Red Storm is complaining about invalid memory pointer (need to go back
  to Linux and look at this with valgrind)
* Turn off send in place for now, so I can run the tests on RS and see if
  everything else is ok

This commit was SVN r7497.
2005-09-23 19:30:54 +00:00
Brian Barrett
07b0b8c943 * add some useful debugging output
* fix dumb bug in btl_portals_get where I using the dest descriptor key instead
  of the source descriptor key for the match bits, resulting in a PtlGet() with
  the wrong match bits

This commit was SVN r7496.
2005-09-23 15:30:18 +00:00
Tim Woodall
604c9d1002 bump the default cache size
This commit was SVN r7492.
2005-09-22 19:37:42 +00:00
Tim Woodall
848f12e7fd if mpi_leave_pinned is enabled - force malloc hooks
This commit was SVN r7491.
2005-09-22 17:27:56 +00:00
Tim Woodall
aceab46c5f use MPI_Alloc_mem/MPI_Free_mem for internally allocated buffers
This commit was SVN r7487.
2005-09-22 16:43:17 +00:00
Tim Woodall
147716c249 added hostname to error output
This commit was SVN r7486.
2005-09-22 16:41:34 +00:00
Tim Woodall
b404581293 local variables/objects (regs) must be initialized/constructed
This commit was SVN r7485.
2005-09-22 16:18:23 +00:00
Tim Woodall
7acf0a6bdb corrections for MPI_Free_mem
This commit was SVN r7481.
2005-09-22 15:47:33 +00:00
Andrew Friedley
555ae37255 Add lib{opal,orte,mpi}.la to appropriate LIBADD's, some whitespace cleanup as well.
This commit was SVN r7477.
2005-09-22 12:28:54 +00:00
Tim Woodall
da1e4a1292 - since we append new registrations to the end of the list,
need to remove old ones from the front
- call deregister to actually remove items from the cache/mru
list and deregister the memory (if not being used)

This commit was SVN r7476.
2005-09-21 23:25:16 +00:00
Tim Woodall
9791c066e8 dont attempt to pin the receive buffer if data has
already been received

This commit was SVN r7475.
2005-09-21 23:23:47 +00:00
Tim Woodall
1b7b220089 cleanup refcount
This commit was SVN r7462.
2005-09-21 20:04:56 +00:00
Tim Woodall
a74ca0062a reductions to initial memory footprint
This commit was SVN r7455.
2005-09-21 19:10:56 +00:00
Galen Shipman
4296e723c9 default free_lists to smaller size..
This commit was SVN r7454.
2005-09-21 18:55:07 +00:00
Galen Shipman
96ab5a6bd3 we can be in WAITING_ACK state without a race if the OOB ack is "slower" than
the scheduling of queued IB send operations. 

This commit was SVN r7452.
2005-09-21 16:47:08 +00:00
Tim Woodall
782e5b21cc cleanup
This commit was SVN r7451.
2005-09-21 15:34:45 +00:00
Tim Woodall
a49a442fe4 cleanup refcount logic
This commit was SVN r7450.
2005-09-21 15:32:27 +00:00
Tim Woodall
0ee34051f8 debug asserts
This commit was SVN r7449.
2005-09-21 15:30:17 +00:00
Tim Woodall
1b73d3856e possible race condition - set endpoint state before sending connect ack
This commit was SVN r7448.
2005-09-20 21:03:55 +00:00
David Daniel
e4985c2a07 Moving totalview spin to the very end of mpi_init
This commit was SVN r7444.
2005-09-20 15:22:15 +00:00
Brian Barrett
fd9901f683 * shell of a portals PML, properly ompi_ignored for most of the world...
This commit was SVN r7437.
2005-09-20 08:07:08 +00:00
Brian Barrett
d81726833e * Add memory barriers for shared memory. Rich and I think we got them
all and the Intel tests pass slightly oversubscribed.

This commit was SVN r7431.
2005-09-19 16:28:25 +00:00
Tim Woodall
aeb5bc3f57 still need to cleanup/revise the template for mpool changes
This commit was SVN r7425.
2005-09-19 14:34:24 +00:00
George Bosilca
b70230858b Correct the misnaming problem in the GM PTL.
This commit was SVN r7424.
2005-09-19 10:34:06 +00:00
Galen Shipman
6499eb3976 init the return code..
This commit was SVN r7423.
2005-09-18 14:25:30 +00:00
George Bosilca
97673b45d1 Remove the last bad symbol from the GM PTL.
This commit was SVN r7422.
2005-09-18 12:52:37 +00:00
George Bosilca
b5cb27c006 The self should use self named files.
This commit was SVN r7421.
2005-09-18 12:37:15 +00:00
George Bosilca
a7db1763e2 Cleanups ...
This commit was SVN r7420.
2005-09-18 12:34:29 +00:00
Jeff Squyres
d67c31f238 Remove useless compiler warnings.
This commit was SVN r7418.
2005-09-17 10:54:48 +00:00
Jeff Squyres
f9a1e14f65 Per suggestion from our friendly Libtool developer friends, add proper
dependencies for liborte and libompi (i.e., make liborte depend on
libopal, and make libmpi depend on liborte)

This commit was SVN r7417.
2005-09-17 10:45:46 +00:00
Galen Shipman
b8cb6e1c64 modified mpool module to contain flags - used to determine if the mpool will
be used in MPI_Alloc_mem operations. Note that we found an interesting bug in
which if memory was allocated by the sm mpool (via mmap) and then registered
via the mvapi mpool, the registration would fail on certain systems. 

Added mca param mpool_base_use_mem_hooks, set to 1 to enable the memory hooks
so that memory is deregistered if the user frees it behind our back. This is
only useful if the mca param mpi_leave_pinned is also set to 1. Otherwise all
registrations are deregistered within the MPI library or via
MPI_Free_buf. After testing we should probably set both mpi_leave_pinned and
mpool_base_use_mem_hooks to default to 1. 

This commit was SVN r7415.
2005-09-16 22:22:03 +00:00
Brian Barrett
c87babb565 * if start_rank == end_rank, there doesn't seem to be a requirement that
stride == 1, and the Intel tests explicitly test the case with
  strides != 1, expecting them to work

This commit was SVN r7411.
2005-09-16 18:44:21 +00:00
Galen Shipman
808b2c1c53 threaded build fix for btl_gm..
This commit was SVN r7409.
2005-09-16 17:18:15 +00:00
Jeff Squyres
2b82224a4f Remove superflous ; (actually causes an error in some cases)
This commit was SVN r7405.
2005-09-16 12:27:25 +00:00
Jeff Squyres
10d02b2110 Make sure to copy the right amount out of the temp buffer.
This commit was SVN r7400.
2005-09-15 22:06:36 +00:00
Galen Shipman
22dac4983c fix mca_mpool_base_free to use new rcache
This commit was SVN r7398.
2005-09-15 21:09:07 +00:00
Jeff Squyres
15d0a95202 - Remove extra whitespace from Makefile.am's from when we removed
Makefile.options
- Sample in each of the three projects of how to link againt the
  relevant libraries so that when components are loaded into a parent
  process' space, we don't rely on the libopal/liborte/libmpi symbols
  being in the parent's public symbol namespace -- instead,
  dynamically link to the relevant libraries, allowing the dynamic
  linker to pull those libraries in at run-time, if needed

This commit was SVN r7397.
2005-09-15 20:56:18 +00:00
Jeff Squyres
46bed32fc6 While double-checking the order of SUBDIRS (ensuring dynamic-mca is
after "."), remove some whitespace left over from when we removed
Makefile.options.

This commit was SVN r7393.
2005-09-15 20:46:29 +00:00
Tim Woodall
9b96ecf9b1 correction for rdma case - currently rdma entire message
This commit was SVN r7392.
2005-09-15 19:36:56 +00:00
Jeff Squyres
3ecfe02b83 - Properly handle MPI_IN_PLACE
- Return MPI_ERR_ARG, not EINVAL

This commit was SVN r7391.
2005-09-15 19:33:54 +00:00