
964 Commits

Author SHA1 Message Date
Galen Shipman
9165882c07 fixes for failover...
This commit was SVN r9998.
2006-05-20 02:39:05 +00:00
Gleb Natapov
1c1b87a9f1 init mutex before use.
This commit was SVN r9963.
2006-05-18 09:35:11 +00:00
Jeff Squyres
15758d5f29 Fix AC_DEFINE to match what it's supposed to be defining
This commit was SVN r9952.
2006-05-17 03:26:43 +00:00
Galen Shipman
deb2254c91 1. mpool_free changes to allow null registrations
2. fix for MPI_Free_mem, was calling deregister but never called mpool_free.. so
we leaked memory. Still an open issue here though, if the memory is alloc'd
and the mpool doesn't create and cache a registration, we will never find the
mpool to free with. 

This commit was SVN r9944.
2006-05-16 22:04:31 +00:00
Jeff Squyres
7b59847765 Ensure that endpoint->endpoint_addr is not NULL before trying to
dereference through it.  It is legal for endpoint_addr to be NULL in the
destructor because if btl_tcp_add_procs() -> btl_tcp_proc_insert()
returns UNREACH, then endpoint_addr will be NULL and we'll OBJ_RELEASE
it.

This commit was SVN r9940.
2006-05-16 19:01:08 +00:00
Jeff Squyres
e24377a89c Back out a pair of commits from George from last week because they
apparently don't work properly: r9869, r9868 (sm btl alignment issues)

This commit was SVN r9936.

The following SVN revision numbers were found above:
  r9868 --> open-mpi/ompi@9b985c3216
  r9869 --> open-mpi/ompi@adedf511fb
2006-05-16 16:48:43 +00:00
Sven Stork
da7ad0e8b8 - update function name inside debug statement
This commit was SVN r9933.
2006-05-16 14:33:41 +00:00
Brian Barrett
dcc6b47fa2 * put rdma operations in the send event queue instead of receive because it's
easier to do event accounting that way
* greatly increase receive event and buffer sizes.  We're still about half
  of what Cray defaults to, so I don't feel bad about the increases
* Implement a pre-pinning optimization for eager fragments - will be
  pinned on first use and left pinned for the life of the fragment
* Since we can't have two receive frag callbacks fired at the same time,
  don't have receive free list - just keep one receive fragment in the
  module.  Saves a big free list and all that interaction.

This commit was SVN r9915.
2006-05-14 04:23:26 +00:00
Brian Barrett
db03ca0cc0 rip out a bunch of code that didn't work and really sucked and was only there
to try to get some numbers that I couldn't actually get.  So back to the
restart point.

This commit was SVN r9914.
2006-05-14 00:59:40 +00:00
Brian Barrett
f2a6e63d82 Fix for the double iWrite problem Edgar found with ROMIO, plus some other
things I found:
  - Locking should prevent it from happening (I think), but there was a 
    race condition in the component progress -- a callback could be
    triggered that would free the request before it was off the outstanding
    requests list.
  - When pulling a request off the component free list, make sure to
    reinitialize the free_called state on the IO request.  This was
    what was causing Edgar's failures
  - In the request cleanup code, pull the request out of the per-
    component free list before returning to the free list.  This
    probably would cause asserts to fire, although it looks like
    I wrote the loops such that it would have been memory safe if
    the asserts didn't fire.  Not really sure why I did that, but
    let's try it again...

This should go to the v1.0 and v1.1 branches.

This commit was SVN r9913.
2006-05-13 02:30:40 +00:00
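The free-list fix described in f2a6e63d82 (reinitializing free_called when a request is pulled off the free list) could look roughly like the sketch below. Only the free_called name comes from the commit message; the io_request and free_list types and the push/pop helpers are hypothetical and are not Open MPI's actual free list API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request type; only free_called is named in the commit. */
struct io_request {
    struct io_request *next;
    bool free_called;
};

/* Hypothetical singly linked free list. */
struct free_list {
    struct io_request *head;
};

/* Return a request to the free list. */
static void free_list_push(struct free_list *fl, struct io_request *req)
{
    assert(fl != NULL && req != NULL);
    req->next = fl->head;
    fl->head = req;
}

/* Take a request off the free list and reset its per-use state, so a
 * stale free_called flag from the previous use cannot leak through. */
static struct io_request *free_list_pop(struct free_list *fl)
{
    assert(fl != NULL);
    struct io_request *req = fl->head;
    if (NULL == req) {
        return NULL;
    }
    fl->head = req->next;
    req->next = NULL;
    req->free_called = false;
    return req;
}
```

Resetting the flag in the pop path, rather than at every call site, keeps the invariant in one place: any request handed out is in its initial state.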
Jeff Squyres
a6d52ceed1 Minor correction in use of mca param API; otherwise the param is not found.
This commit was SVN r9903.
2006-05-11 22:12:29 +00:00
Andrew Friedley
4c3aa05c83 uDAPL expects memory for enumerating interface adapters in a really
weird way - fix up to do things 'properly'.

Add my sandia username to the unignore.

This commit was SVN r9879.
2006-05-10 19:50:30 +00:00
George Bosilca
adedf511fb Remove the printf that I unfortunately committed.
This commit was SVN r9869.
2006-05-10 00:02:54 +00:00
George Bosilca
9b985c3216 Force the useful data to be aligned on a special boundary. It is 32 bits
right now. Some testing on large NUMA machines should be done in order
to make sure that we need to export this variable out to the MCA layer.

This commit was SVN r9868.
2006-05-09 21:46:10 +00:00
George Bosilca
a386fccccc Increase the default limits for the SM BTL. These new
values allow better performance on all the clusters
I was able to test.

This commit was SVN r9867.
2006-05-09 21:44:24 +00:00
Brian Barrett
91086cf2a4 * we want to unlink match entries when we unlink memory descriptors, but
I want to be lazy and not do it by hand, so set the match entries to
  PTL_UNLINK.

This commit was SVN r9861.
2006-05-09 14:20:51 +00:00
Gleb Natapov
0c34d5c9e6 fix endpoint matching in on demand connection establishment. This fix is in mvapi btl already.
This commit was SVN r9855.
2006-05-09 12:12:52 +00:00
Brian Barrett
1d337831d0 Fixes for more issues found by Dries Kimpe:
  - We had a bad conditional choice, such that asking for pvfs2 would
    result in pvfs trying to build as well, which was going to fail.
  - We didn't try to link in the library for PVFS2's adio component.
  - We were clobbering romio_flags, so it was impossible to pass
    flags to romio (like the selection of filesystems)

This commit was SVN r9854.
2006-05-09 09:30:09 +00:00
Galen Shipman
c992eeb1f3 don't need to decrement memory registered twice; this is done in
mru_delete.

This commit was SVN r9853.
2006-05-08 17:42:34 +00:00
Brian Barrett
7dddc6d54c Define the constants needed by ROMIO to activate support code for
DARRAY / SUBARRAY.

This commit was SVN r9851.
2006-05-08 16:33:31 +00:00
Brian Barrett
462849d88c Fix two issues reported by Dries Kimpe:
  - LDFLAGS set at the top level of Open MPI were not passed to the
    ROMIO configure script
  - If ROMIO was explicitly required (with --enable-io-romio) and
    not able to be built, abort OMPI's configure script.

This needs to go to the v1.0 and v1.1 branches.

This commit was SVN r9845.
2006-05-08 13:13:32 +00:00
Brian Barrett
8397a1d71f still running into issues, but...
- change MASK behavior for tags - we need the upper bit to be whether
  the tag is reserved or not.  MPI_ANY_TAG should not pull off any
  reserved tag communication
- some other random debugging output to try to get some idea what is
  spewing out of here.

This commit was SVN r9844.
2006-05-08 09:23:09 +00:00
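The tag MASK change in 8397a1d71f (upper bit marks a reserved tag, and a wildcard receive must not match reserved traffic) can be illustrated with Portals-style match/ignore bits. This is a sketch under assumptions: the 32-bit layout, the constant names, and the helper functions are hypothetical and not taken from the Open MPI source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit tag layout: top bit = reserved, low 31 bits = user tag. */
#define TAG_RESERVED_BIT 0x80000000u
#define TAG_USER_MASK    0x7FFFFFFFu

/* Portals-style comparison: bits set in `ignore` are not compared. */
static bool bits_match(uint32_t incoming, uint32_t match, uint32_t ignore)
{
    return ((incoming ^ match) & ~ignore) == 0;
}

/* Build a user tag; user tags must leave the reserved bit clear. */
static uint32_t make_user_tag(uint32_t tag)
{
    assert((tag & TAG_RESERVED_BIT) == 0);
    return tag & TAG_USER_MASK;
}

/* Wildcard receive: ignore the 31 user bits but still compare the
 * reserved bit against 0, so reserved traffic never matches. */
static bool any_tag_matches(uint32_t incoming_tag)
{
    return bits_match(incoming_tag, 0u, TAG_USER_MASK);
}
```

Here a wildcard accepts any user tag, while a tag with the reserved bit set is rejected because that bit is excluded from the ignore mask.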
George Bosilca
e658557d52 Move the convertor creation out of the critical path. If we expect a
message from a known peer (not MPI_ANY_SOURCE) then we can attach the
remote proc and initialize the convertor as soon as we know the data-type,
and the count (so basically in the _INIT macro). If it's not the case, then
create them in the _MATCHED macro (as in the original version). Of course,
before initializing the convertor we check that there will be some data
in the message.

This commit, plus the convertor improvements from a few days ago, lowers the
latency for my test case environment (mvapi) by 0.1 microseconds. The convertor
now is as slim as it can be, I don't think there is anything else to
remove/improve. 

This commit was SVN r9843.
2006-05-07 21:03:12 +00:00
George Bosilca
a7542824ed Generic length computation (moved from the endpoint.h).
This commit was SVN r9842.
2006-05-07 20:54:44 +00:00
George Bosilca
569b88e093 The endpoint include is not required.
This commit was SVN r9841.
2006-05-07 20:52:55 +00:00
George Bosilca
e63c1dc242 The last commit wasn't supposed to bring this function in. It's not yet
ready for primetime...

This commit was SVN r9840.
2006-05-07 20:51:43 +00:00
George Bosilca
33aa65f894 Remove useless include.
This commit was SVN r9839.
2006-05-07 20:49:45 +00:00
Galen Shipman
a4c9db0c18 decrease the total bytes in the rcache when a registration is deleted from the
cache. 

This commit was SVN r9837.
2006-05-07 01:15:33 +00:00
Rainer Keller
0f9b10ff8e - Update test dup MPI_COMM_WORLD -- so that we may
have additional Barriers for output.

This commit was SVN r9831.
2006-05-05 07:42:33 +00:00
Rainer Keller
71d328c086 - Add the PERUSE_COMM_REQ_XFER_CONTINUE for recv.
This commit was SVN r9820.
2006-05-04 19:31:33 +00:00
Tim Woodall
161e54e6c8 finalize/cleanup failed btl
This commit was SVN r9819.
2006-05-04 18:48:45 +00:00
Tim Woodall
d8ff8010f3 track whether the vfrag is being retransmitted
This commit was SVN r9817.
2006-05-04 17:30:58 +00:00
Tim Woodall
1b26caa95b first cut at btl failover - seems to be working for simple test case
This commit was SVN r9816.
2006-05-04 16:16:26 +00:00
Tim Woodall
350d5b1713 change hardcoded values into mca params
This commit was SVN r9815.
2006-05-04 15:20:18 +00:00
Tim Woodall
fdd622544b added optional copy routine to allow "derived" class
of mca_bml_base_endpoint to copy state if an endpoint
is updated (e.g. btl deleted/added)

This commit was SVN r9814.
2006-05-04 15:19:12 +00:00
Brian Barrett
d101e91b97 * fix matching logic - since tag might be negative, need to mask the proper bits
or the bit-wise or changes all the high bits, which is bad
* push convertor creation to init to save a bit of time
* make debugging use macros so that it can go bye-bye

This commit was SVN r9810.
2006-05-04 13:48:32 +00:00
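The matching-logic problem described in d101e91b97 (a negative tag corrupting the high bits when OR-ed into the match bits) is easy to reproduce in isolation. The sketch below assumes a hypothetical 64-bit layout of [context:16 | source:16 | tag:32]; the layout, names, and helpers are illustrative only and do not reflect the actual Portals PML encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: [ context:16 | source:16 | tag:32 ]. */
#define TAG_MASK 0x00000000FFFFFFFFULL

/* Buggy variant: a negative tag sign-extends to 64 bits, so the OR
 * sets every high bit and wipes out the context and source fields. */
static uint64_t pack_unmasked(uint16_t ctx, uint16_t src, int32_t tag)
{
    return ((uint64_t)ctx << 48) | ((uint64_t)src << 32)
           | (uint64_t)(int64_t)tag;
}

/* Fixed variant: mask the tag down to its own field before the OR. */
static uint64_t pack_masked(uint16_t ctx, uint16_t src, int32_t tag)
{
    return ((uint64_t)ctx << 48) | ((uint64_t)src << 32)
           | ((uint64_t)(int64_t)tag & TAG_MASK);
}

/* Small self-check showing the difference for tag = -1. */
void check_packing(void)
{
    assert(pack_unmasked(3, 7, -1) == 0xFFFFFFFFFFFFFFFFULL);
    assert(pack_masked(3, 7, -1) == 0x00030007FFFFFFFFULL);
}
```

For non-negative tags both variants agree; only the masked one preserves the context and source fields when the tag is negative.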
George Bosilca
bdecdc8d41 Cleanup the MX BTL. Remove all mpool related code as there will never be a MX mpool.
This commit was SVN r9808.
2006-05-04 06:55:45 +00:00
George Bosilca
c5209aad93 The return value is random. Let's return something that makes sense.
This commit was SVN r9805.
2006-05-03 18:17:00 +00:00
Brian Barrett
6db0f2a027 * couple of corrections to compile on Red Storm
This commit was SVN r9801.
2006-05-03 13:13:59 +00:00
Brian Barrett
4add400f7d * properly start with the memory descriptor inactive
This commit was SVN r9787.
2006-05-01 20:23:38 +00:00
Brian Barrett
5f939c53be * first take at send / receive for a Portals pml (still really dumb and simple)
This commit was SVN r9786.
2006-05-01 20:03:49 +00:00
Brian Barrett
56f48357b3 * don't try to register callback at init time (will do at window creation time
anyway), so that we can run without ob1

This commit was SVN r9785.
2006-05-01 20:03:03 +00:00
Brian Barrett
4256705ffb * rename irecv, isend, and iprobe files to recv, send, and probe
This commit was SVN r9780.
2006-04-29 22:06:21 +00:00
Brian Barrett
315a889247 Try to get the Portals PML going again, just to get some data for the Cray
paper.  This is just the shell, for checkpoint.  Changes:

* Fix copyrights
* remove cancel code and ptl references
* add dump command 

This commit was SVN r9779.
2006-04-29 22:05:20 +00:00
Tim Woodall
02d991532f interface to post a callback for notification of change to modex data
This commit was SVN r9753.
2006-04-27 16:15:35 +00:00
Tim Woodall
4fd2a71b6c removed debug code - free list implementation has changed
This commit was SVN r9750.
2006-04-27 15:34:12 +00:00
Brian Barrett
9cab1bb54a * re-enable the eager fragment throttling, this time with the proper threshold value for when
the memory descriptor is closing itself, so that it actually works properly ;).  I think I
  was just getting lucky and not sending enough short messages with the reference impl.

This commit was SVN r9748.
2006-04-27 14:13:52 +00:00
Brian Barrett
66d1d3b83f * add a quick debugging sanity check
* It appears that Cray's SeaStar has some horrible performance for iovecs - IN_PLACE
  was actually slower than copying into eager frags.  Ugh.  And we don't even pre-pin
  eager frags yet!

This commit was SVN r9738.
2006-04-27 02:55:31 +00:00
George Bosilca
3e968d4f63 There is no length on the free list.
This commit was SVN r9704.
2006-04-24 23:13:51 +00:00
Brian Barrett
1da22f9099 * silence a bunch of compiler warnings on Solaris when using the Sun
compilers.

  This should go to the v1.1 branch

This commit was SVN r9693.
2006-04-23 21:15:09 +00:00