Commit graph

1903 commits

Author SHA1 Message Date
Gleb Natapov
3b34dc8df8 remove MCA_BTL_IB_FRAG_ALIGN. Alignment is handled in free_list_t.
This commit was SVN r10945.
2006-07-23 12:33:49 +00:00
Gleb Natapov
91f48f9a79 Merge with gleb-pml branch. Add out of resource handling support to PML layer.
If a resource is not available, the request is added to one of the pending lists and retried later.

This commit was SVN r10900.
2006-07-20 14:44:35 +00:00
Gleb Natapov
383694c68d Add support to get aligned buffers from free_list_t. Convert openib BTL to new interface.
This commit was SVN r10899.
2006-07-20 14:39:05 +00:00
Brian Barrett
4c101c6394 * rename the collectives sm bootstrap area to be consistent with other
shared memory segments
* make sure to properly unlink the collectives sm bootstrap area at
  shutdown
* Add missing / in the path for the mpool shared memory segment
* make sure to release the common_mmap structure in the SM btl
  after unlinking the file during shutdown

This commit was SVN r10886.
2006-07-19 20:55:29 +00:00
George Bosilca
21c542f0a5 Make the SM BTL FT friendly. Now there are 3 FT friendly BTLs: TCP, SM
and self.

This commit was SVN r10780.
2006-07-13 07:42:18 +00:00
George Bosilca
d00e6e29e8 Create a close function for the mpool SM module, in order to allow the cleanup. The
mca_common_sm_mmap file was left over by the SM mpool, and there was nobody able
to unmap and unlink it.

This commit was SVN r10770.
2006-07-12 22:12:07 +00:00
George Bosilca
fd39203262 As the self proc is marked as local, there will always be at least one local
proc. Don't create the SM file until we really know there is someone else on
the same node.

This commit was SVN r10740.
2006-07-11 17:05:13 +00:00
George Bosilca
14b3f141db Nothing relevant !!!
This commit was SVN r10711.
2006-07-11 00:30:26 +00:00
Andrew Friedley
b7e0484c37 Give up on dat_ep_query() and instead manually send our address information across the wire after connection establishment.
I've introduced a race condition - seeing occasional LOCAL_LENGTH errors on the receive side.  I think I'm mixing up eager/max somehow - will look at it more on monday.

This commit was SVN r10690.
2006-07-07 21:48:16 +00:00
Gleb Natapov
e05ec69dc4 print "flush error" only once.
This commit was SVN r10672.
2006-07-06 08:03:01 +00:00
Gleb Natapov
9b0807e547 Put pending fragment on the right waiting list.
This commit was SVN r10671.
2006-07-06 07:51:23 +00:00
Brian Barrett
4ee4acb6a6 * ignore some Cray-only code when not on the Cray machine
This commit was SVN r10660.
2006-07-05 17:16:27 +00:00
Brian Barrett
043153dad3 * fix opal_list_item_t -> ompi_free_list_item_t type change
This commit was SVN r10659.
2006-07-05 17:02:16 +00:00
Brian Barrett
47725c9b02 * Add new PML (CM) and network drivers (MTL) for high speed
interconnects that provide matching logic in the library.
  Currently includes support for MX and some support for
  Portals
* Fix overuse of proc_pml pointer on the ompi_proc structure,
  splitting into proc_pml for pml data and proc_bml for
  the BML endpoint data
* bug fixes in bsend init code, which wasn't being used by
  the OB1 or DR PMLs...

This commit was SVN r10642.
2006-07-04 01:20:20 +00:00
Galen Shipman
7e079d20ab fix for stupid casting.. addresses issue on PPC64 where sizes get set
improperly and badness ensues..

This commit was SVN r10574.
2006-06-29 21:58:50 +00:00
George Bosilca
7d59a6885b Remove all references to the MRU list. Add back the repost list checks. For some reason
it decreases the latency by around 0.3 microseconds ...

This commit was SVN r10571.
2006-06-29 19:25:44 +00:00
George Bosilca
78f0de127d Typo.
This commit was SVN r10567.
2006-06-29 15:16:25 +00:00
George Bosilca
238147f576 Help the compiler to optimize the code. Now the order in the enum reflects the
order we use them in the switch.

This commit was SVN r10565.
2006-06-29 15:10:58 +00:00
George Bosilca
9bf281bca2 Remove the gm_mru_reg list as it is never used. Cleanup the repost logic. Now we repost
a receive fragment only when we're done with the message from inside and we try to add it
to the list.

This commit was SVN r10564.
2006-06-29 15:10:11 +00:00
George Bosilca
43b7b17033 Release the memory registration when the descriptors get freed.
This commit was SVN r10540.
2006-06-28 15:24:16 +00:00
George Bosilca
d9daa34a6c Set the registration field to NULL when we create a new fragment.
This commit was SVN r10539.
2006-06-28 15:23:36 +00:00
Gleb Natapov
c8f75c472a remove modulo op from fast path. Improvement 0.02-0.04ms.
This commit was SVN r10538.
2006-06-28 12:00:47 +00:00
Gleb Natapov
e58a89ef3e OMPI_ENABLE_DEBUG is always defined (to 0 or 1). Use #if and not #ifdef.
This commit was SVN r10537.
2006-06-28 11:25:09 +00:00
Gleb Natapov
704a5eb645 Support for LMC (lid mask count) and multiple QPs per port.
This commit was SVN r10536.
2006-06-28 07:23:08 +00:00
Galen Shipman
e6cd8db0e5 DR will now checksum on a per btl basis (see MCA_BTL_FLAGS_NEED_CSUM). We
still always send ACK's, teasing apart completion for ACK/no ACK looks like a
pain in the .. 

This commit was SVN r10530.
2006-06-27 20:23:47 +00:00
Jeff Squyres
df45221a3e Until a real fix for #142 is found, this workaround prohibits using
mpi_leave_pinned when multiple OpenIB HCA ports are found.
Specifically, if mpi_leave_pinned == 1 and multiple HCA ports are
found, the MCA parameter btl_openib_max_btls is set to 1.  If the MCA
parameter btl_openib_warn_leave_pinned_multi_port is true, emit a
warning that this happened (having an MCA parameter to control the
warning allows users/sysadmins to turn it off instead of being nagged
for every run).

This commit was SVN r10521.
2006-06-27 10:43:03 +00:00
Gleb Natapov
52208d7bf9 We don't need to register zero sized frags.
This commit was SVN r10519.
2006-06-27 08:50:12 +00:00
Galen Shipman
8855e5b73a Fixes for DR as well as better diagnostic..
Successfully passing the intel test suite with/without induced errors/drops. 

This commit was SVN r10518.
2006-06-26 22:29:29 +00:00
Gleb Natapov
b7715395cb Return descriptor before sending credits one more time. We may need it.
This commit was SVN r10495.
2006-06-26 07:05:58 +00:00
Andrew Friedley
7bfac82ce7 Change over from lazy connection setup to setting up at initialization
time.

UD is connectionless, and as long as peers are statically assigned to QPs,
there is no reason to set up the addressing information lazily.

Lots of code was axed, as endpoints no longer have state.  Removed a
number of other elements in the endpoint struct to make it as lightweight
as possible.

I was able to remove an entire function call/branch in the send path,
which I believe is the main contributor to a 2us drop in NetPIPE latency.

Some whitespace cleanups as well.

Passes IBM test suite, and all but certain intel tests that were failing
before the change, over ob1 PML.

This commit was SVN r10494.
2006-06-23 16:50:50 +00:00
Andrew Friedley
046f4cd4ae Enough cleanup for now.
Moved a lot of the module-specific init from the component init to the module init.

Try keeping a pointer to reduce indexing, didn't seem to help - leaving in place
for now.

This commit was SVN r10485.
2006-06-22 22:12:13 +00:00
Andrew Friedley
8392ed4cac A checkpoint before I really do some cleanup.. nothing pretty here.
Playing around with OPAL_LIKELY/UNLIKELY, no real gains yet.

Reworked progress() to process many WC's at a time, as well
as immediately repost groups of receive buffers.

This commit was SVN r10481.
2006-06-22 18:06:55 +00:00
Andrew Friedley
365c81d6e9 Fix a few issues reported by Terry Dontje:
1. ompi/mca/btl/udapl/btl_udapl_proc.c should be including
btl_udapl_endpoint.h for mca_btl_udapl_proc_insert function.

2. btl_udapl_endpoint.c it looks like you are using
&endpoint->endpoint_lock when you should use &ep->endpoint_lock in a
OPAL_THREAD_LOCK call.

3. btl_udapl_frag.h has a couple opal_list_item_t's that should be
ompi_free_list_item_t in the _FRAG_ALLOC_{EAGER,MAX} macros.

This commit was SVN r10442.
2006-06-20 17:13:44 +00:00
George Bosilca
044868df45 Set the destination descriptor before calling the recv registration. Once
this call is completed, we have to remove it in order to be able to cleanup
correctly the fragments.

This commit was SVN r10428.
2006-06-20 14:11:09 +00:00
George Bosilca
1b18b7d934 Change the parameter registration of this BTL to the new calls (new is relative
here). Change the self BTL to use RDMA protocol.

This commit was SVN r10427.
2006-06-20 14:09:58 +00:00
Jeff Squyres
1d27ca5d0a Until a real fix for #142 is found, this workaround prohibits using
mpi_leave_pinned when multiple OpenIB HCA ports are found.
Specifically, if mpi_leave_pinned == 1 and multiple HCA ports are
found, the MCA parameter btl_openib_max_btls is set to 1.  If the MCA
parameter btl_openib_warn_leave_pinned_multi_port is true, emit a
warning that this happened (having an MCA parameter to control the
warning allows users/sysadmins to turn it off instead of being nagged
for every run).

This commit was SVN r10424.
2006-06-20 11:32:46 +00:00
Jeff Squyres
600bf4295a Update the help message to be slightly more concise and clear
This commit was SVN r10422.
2006-06-20 11:23:38 +00:00
Brian Barrett
3d027e57a8 * fix for ticket #141. If we are going to shortcut out of polling the
send/receive queues if there is something available in the short message
  rdma queues, then we have to poll *ALL* the rdma queues before exiting,
  or we aren't fair about frag reception and fall into degenerate matching
  cases.

This commit was SVN r10410.
2006-06-17 21:32:25 +00:00
Brian Barrett
05046e8ad2 if MX isn't running on some hosts, but is on others, we were blocking in the modex receive
waiting for the non-running procs to publish their contact information.  Publish their
(lack of) contact information.

This commit was SVN r10355.
2006-06-14 19:07:38 +00:00
George Bosilca
aca71521db Complete the move of the mpool registration from opal_list_item_t to the
ompi_free_list_item_t.

This commit was SVN r10354.
2006-06-14 17:43:50 +00:00
Brian Barrett
d367dc5d56 * Fix for bug #115 -- we need to decrement the use count on a pinned buffer
so that memory is actually deregistered.  Reviewed by Galen.

This commit was SVN r10349.
2006-06-14 13:38:24 +00:00
Andrew Friedley
c68c6ac122 A number of fixes and the usual cleanup..
- Added some basic flow control to limit number of posted sends.
- Merged endpoint send/recv lock into single endpoint lock.
- Set the LMR triplet length in the send path, not at allocation time.
  This has to be done because upper layers might send less than the
  amount allocated.
- Alter the tie-breaker if statement protecting the second call
  to dat_ep_connect().  The logic was reversed compared to the tie-
  breaker for the first dat_ep_connect(), making it possible for
  3 or more processes to form a deadlock loop.
- Some asserts were added for debugging purposes.. leaving them
  in place for now.

This commit was SVN r10317.
2006-06-12 22:42:01 +00:00
Galen Shipman
218a438509 finished the ompi_free_list_t class nightmare..
This commit was SVN r10314.
2006-06-12 22:09:03 +00:00
Brian Barrett
d5acb4e3cc * silence dumb (and mostly useless) warning during cleanup
This commit was SVN r10280.
2006-06-09 21:09:53 +00:00
Jeff Squyres
a4030ad2d9 Improve the tremendously unhelpful MCA help message for the
btl_openib_ib_mtu and btl_mvapi_ib_mtu MCA params by showing the valid
values and what they represent (got a question about this from Cisco
testing engineers).

This commit was SVN r10277.
2006-06-09 18:02:45 +00:00
Andrew Friedley
75176370ae blah. somehow missed adding .ompi_ignore/.ompi_unignore.
This commit was SVN r10272.
2006-06-09 00:15:36 +00:00
Andrew Friedley
cca1616368 Finally committing the UD BTL.
UD is the Unreliable Datagram transport for Infiniband, specifically OpenIB.  This BTL is derived from the existing openib BTL, which is RC (Reliable Connection) based.

Still a work in progress, as there is a lot of work left to do.  Specifically, performance, scalability, and flow control need to be addressed.

Currently I'm playing around with different methods for handling receive buffers, as well as profiling to figure out where the time is going.

This commit was SVN r10271.
2006-06-09 00:13:45 +00:00
Galen Shipman
90799f82cd copy paste error..
This commit was SVN r10220.
2006-06-06 02:38:29 +00:00
Galen Shipman
cc54b07aa0 add better error messages for vapi retry exceeded errors.
This commit was SVN r10219.
2006-06-06 02:04:56 +00:00
Galen Shipman
9e6e7575b9 doh... add the file..
This commit was SVN r10210.
2006-06-05 21:24:42 +00:00
Galen Shipman
f05dee0435 add help file to explain why things went south..
This commit was SVN r10209.
2006-06-05 21:23:45 +00:00
Galen Shipman
74c97fb784 cleanup error reporting.. use ompi_proc_t->proc_name if available this gives
us source/dest hostnames for communication errors.. 

This goes to 1.1 branch (reviewed by Brian).. 

This commit was SVN r10200.
2006-06-05 20:02:41 +00:00
Galen Shipman
0344ae4ac5 Fix to allow eager limit and max send size to be any size (within resource limitations). Instead of storing the ompi_free_list_t * in the fragment, we use the frag type enum, this tells us where the frag came from and where it should return.. This could also be done in mvapi but is not a high priority moving forward..
Review by Brian, needs to hit the trunk + 1.1 release.. 

This commit was SVN r10157.
2006-06-01 02:32:18 +00:00
Brian Barrett
5163f2b296 Fix for bug #36. The MX, MVAPI, and OpenIB components don't have
support for progress threads, so we shouldn't build them or try to use
them when support for progress threads has been requested.  The TCP, GM,
SELF, and SM BTLs should have progress thread support, so they aren't
disabled.  The Portals BTL isn't compiled on platforms with threads,
so it doesn't need to be updated.

This commit was SVN r10156.
2006-06-01 01:30:16 +00:00
Galen Shipman
c79efc9efb track which list a fragment came from, allows returning based on list, not
on size. 

This commit was SVN r10142.
2006-05-31 14:24:32 +00:00
Brian Barrett
c723d196c5 Rather than using fragment size to determine fragment type, use an enum.
Do this rather than the my_list pointer because we need to do some
things that are somewhat special because we pre-pin eager fragments but
not send fragments.  Also makes a couple ideas I have slightly easier to
play around with.

This commit was SVN r10127.
2006-05-31 03:34:32 +00:00
Galen Shipman
2667c52a5d Track fragments by list, not by size..
-- reviewed by Brian, needs to hit all the branches.. 

This commit was SVN r10078.
2006-05-25 18:07:26 +00:00
Galen Shipman
38a0561d9b Allow maximum send size to be less than the eager limit.
Instead of figuring out which free list the fragment belongs to based on size
we simply store a pointer to the list which it belongs in the fragment.

This was reviewed by Brian and should hit all the branches.

This commit was SVN r10072.
2006-05-25 16:57:14 +00:00
Andrew Friedley
8a3d0862ca I can commit! *happy dance*
Trying to remember what I did here.. eager/max messages should work now, no RDMA yet.  A number of other fixes and cleanups.

I do know of two problems:
 Bad stuff happens when flooded with send frags too quickly - the BTL doesn't handle flow control.
 Certain IBM tests turn up a length assertion in the datatype engine - needs more investigation.

This commit was SVN r10070.
2006-05-25 15:47:59 +00:00
Gleb Natapov
f590d8a190 fix eager RDMA on PPC64.
This commit was SVN r10059.
2006-05-25 11:05:12 +00:00
Jeff Squyres
dd44d36be0 Fix for ticket #25. Ensure that in the threaded case where we have
This commit was SVN r10043.
2006-05-24 16:15:07 +00:00
George Bosilca
085cac552f Don't let TCP create local connections; we have the self BTL for this purpose.
This commit was SVN r10018.
2006-05-23 03:06:32 +00:00
Jeff Squyres
7b59847765 Ensure that endpoint->endpoint_addr is not NULL before trying to
dereference through it.  It is legal for endpoint_addr to be NULL in the
destructor because if btl_tcp_add_procs() -> btl_tcp_proc_insert()
returns UNREACH, then endpoint_addr will be NULL and we'll OBJ_RELEASE
it.

This commit was SVN r9940.
2006-05-16 19:01:08 +00:00
Jeff Squyres
e24377a89c Back out a pair of commits from George from last week because they
apparently don't work properly: r9869, r9868 (sm btl alignment issues)

This commit was SVN r9936.

The following SVN revision numbers were found above:
  r9868 --> open-mpi/ompi@9b985c3216
  r9869 --> open-mpi/ompi@adedf511fb
2006-05-16 16:48:43 +00:00
Brian Barrett
dcc6b47fa2 * put rdma operations in the send event queue instead of receive because it's
easier to do event accounting that way
* greatly increase receive event and buffer sizes.  We're still about half
  of what Cray defaults to, so I don't feel bad about the increases
* Implement a pre-pinning optimization for eager fragments - will be
  pinned on first use and left pinned for the life of the fragment
* Since we can't have two receive frag callbacks fired at the same time,
  don't have receive free list - just keep one receive fragment in the
  module.  Saves a big free list and all that interaction.

This commit was SVN r9915.
2006-05-14 04:23:26 +00:00
Andrew Friedley
4c3aa05c83 uDAPL expects memory for enumerating interface adapters in a really
weird way - fix up to do things 'properly'.

Add my sandia username to the unignore.

This commit was SVN r9879.
2006-05-10 19:50:30 +00:00
George Bosilca
adedf511fb Remove the printf that I unfortunately committed.
This commit was SVN r9869.
2006-05-10 00:02:54 +00:00
George Bosilca
9b985c3216 Force the useful data to be aligned on special boundary. It is 32 bits
right now. Some testing on large NUMA machines should be done in order
to make sure that we need to export this variable out to the MCA layer.

This commit was SVN r9868.
2006-05-09 21:46:10 +00:00
George Bosilca
a386fccccc Increase the default limits for the SM BTL. These new
values allow better performance on all the clusters
I was able to test.

This commit was SVN r9867.
2006-05-09 21:44:24 +00:00
Gleb Natapov
0c34d5c9e6 fix endpoint matching in on demand connection establishment. This fix is in mvapi btl already.
This commit was SVN r9855.
2006-05-09 12:12:52 +00:00
Tim Woodall
350d5b1713 change hardcoded values into mca params
This commit was SVN r9815.
2006-05-04 15:20:18 +00:00
George Bosilca
bdecdc8d41 Cleanup the MX BTL. Remove all mpool related code as there will never be a MX mpool.
This commit was SVN r9808.
2006-05-04 06:55:45 +00:00
Tim Woodall
4fd2a71b6c removed debug code - free list implementation has changed
This commit was SVN r9750.
2006-04-27 15:34:12 +00:00
Brian Barrett
9cab1bb54a * re-enable the eager fragment throttling, this time with the proper threshold value for when
the memory descriptor is closing itself, so that it actually works properly ;).  I think I
  was just getting lucky and not sending enough short messages with the reference impl.

This commit was SVN r9748.
2006-04-27 14:13:52 +00:00
Brian Barrett
66d1d3b83f * add a quick debugging sanity check
* It appears that Cray's SeaStar has some horrible performance for iovecs - IN_pLACE
  was actually slower than copying into eager frags.  Ugh.  And we don't even pre-pin
  eager frags yet!

This commit was SVN r9738.
2006-04-27 02:55:31 +00:00
George Bosilca
3e968d4f63 There is no length on the free list.
This commit was SVN r9704.
2006-04-24 23:13:51 +00:00
Brian Barrett
9a65ddd788 * back out r9005, which for some reason works fine on the reference implementation
but causes resource exhaustion on the Red Storm implementation.  Sigh...

This commit was SVN r9686.

The following SVN revision numbers were found above:
  r9005 --> open-mpi/ompi@20d06e889e
2006-04-22 20:12:33 +00:00
Andrew Friedley
345551cb36 Checkpoint before starting work on max-sized frags (maybe user too?).
- Some initial work on prepare_src
- Move some fragment initialization around
- Fix a union casting issue on picky compilers, identified by Don Kerr
- Other small cleanups/bugfixes

This commit was SVN r9662.
2006-04-19 22:20:22 +00:00
George Bosilca
61bea41350 The same in MX (missing copyright).
This commit was SVN r9661.
2006-04-19 21:37:30 +00:00
George Bosilca
afe9821d84 Add a missing copyright.
This commit was SVN r9660.
2006-04-19 21:36:22 +00:00
Tim Woodall
10f343734f decrease eager limit to 12K (improves latency)
This commit was SVN r9646.
2006-04-14 22:28:37 +00:00
Tim Woodall
6523c12e4b - decrease eager limit to 12K (improves latency)
- trigger event library while setting up connections

This commit was SVN r9645.
2006-04-14 22:28:05 +00:00
Tim Woodall
c6489cb5aa - turn on eager rdma by default
This commit was SVN r9641.
2006-04-14 21:11:14 +00:00
George Bosilca
b3cc3d82d3 Activate the OOB while we setup connections for MVAPI. Same thing should be done for the
Open IB ...

This commit was SVN r9640.
2006-04-14 20:53:42 +00:00
Gleb Natapov
98282a3567 fix spelling. threashold -> threshold.
This commit was SVN r9577.
2006-04-08 08:13:37 +00:00
Andrew Friedley
d461b55696 - Implement OOB connection handshaking via the ORTE RML. To start a connect,
we send our local addr_t OOB.  Remote side then matches endpoints and calls
  dat_ep_connect().  Everything should be the same as before from here, except
  that client/server roles are reversed.
- Properly set our buffer size when posting receives.  When the frag used to
  transfer address information is recycled by the free list, the wrong buffer
  size was being used, which caused buffer overflow errors.
- Finally put the uDAPL error handling stuff in the mpool component.
- Remove a few more OPAL_OUTPUTs.

This commit was SVN r9569.
2006-04-07 15:26:05 +00:00
Gleb Natapov
b6ab1f4262 fix compilation warnings.
This commit was SVN r9515.
2006-04-02 11:32:25 +00:00
Andrew Friedley
74b2f77a4c The expected cleanup/refactoring commit..
Not much got tested that wasn't already - I've uncovered a connection
establishment deadlock and wanted to get these changes committed before I
attack it.

The big changes:
 - Moved much of the connection code from btl_udapl_component.c to
   btl_udapl_endpoint.c.
 - Cleaned up initialization of various fragment members.
 - MCA_BTL_UDAPL_ERROR macro, which is compiled in/out appropriately.

This commit was SVN r9496.
2006-03-31 16:25:19 +00:00
Gleb Natapov
256bf70530 Forgot to add file to previous commit
This commit was SVN r9480.
2006-03-30 17:37:52 +00:00
Gleb Natapov
79bcfb096f Add type to frag. Sometimes we need to know that a frag is from short rdma area.
I used hack for this that doesn't work for mvapi, so changing it to something more sane.

This commit was SVN r9477.
2006-03-30 15:26:21 +00:00
Gleb Natapov
ea11582191 Porting of short message RDMA from openib BTL. Endpoint registers circular buffer and sends its address and rkey to the peer. Peer uses this buffer to eagerly RDMA small message into it. Endpoint polls the buffer for message arrival before checking HP/LP QPs. Set btl_mvapi_use_eager_rdma to 1 to enable it.
This commit was SVN r9474.
2006-03-30 12:55:31 +00:00
Andrew Friedley
0eba366b07 Various pieces all over to make basic small message send/recv work. Next step
is clean up the code.. it is in need of refactoring and testing.

Thanks to Brian for help in troubleshooting!

This commit was SVN r9466.
2006-03-29 21:55:41 +00:00
Gleb Natapov
590c992a7e fix recursive lock of openib_btl->ib_lock.
This commit was SVN r9427.
2006-03-26 15:02:43 +00:00
Gleb Natapov
01a119c3c5 fix compilation bug with --enable-mpi-threads
This commit was SVN r9426.
2006-03-26 13:24:10 +00:00
Gleb Natapov
a5a78b10cc Implementation of short message RDMA. Endpoint registers circular buffer and sends its address and rkey to the peer. Peer uses this buffer to eagerly RDMA small message into it. Endpoint polls the buffer for message arrival before checking HP/LP QPs. Set btl_openib_use_eager_rdma to 1 to enable it.
This commit was SVN r9425.
2006-03-26 08:30:50 +00:00
Andrew Friedley
48d61cd99a Mostly fragment/LMR handling fixes:
- Grab the mpool_registration in _frag_common_constructor()
 - Save the LMR context in the segment key
 - No need for cookie variables - can just cast the frag
 - No need to memcpy() data when recv'ing
 - Add an LMR triplet to the fragment structure and initialize it
   in btl_udapl_alloc().
 - Whitespace/typo fixes, remove some opal_output() calls

Looks like I can use triplets describing sub-regions of registered LMR's.  So I
do this - prior to this patch I was sending the entire free list memory over,
which isn't correct :)

Back to an earlier problem - when sending address information right after
connection establishment, the receiving end receives a DTO completion event and
appears to have good data.  But the sending end never receives a DTO completion
event indicating the send completed, and never completes the client side of the
connection.

This commit was SVN r9386.
2006-03-23 16:21:08 +00:00
Tim Woodall
c7ee5e13bc simplification - don't swap src/dst pointers - always leave both
src/dst pointing to same segments

This commit was SVN r9357.
2006-03-21 18:20:17 +00:00
George Bosilca
f7a5a582c5 Diagnostic function for mvapi. It prints all the credits used for the flow control.
This commit was SVN r9355.
2006-03-21 17:02:14 +00:00
Andrew Friedley
cf9246f7b9 Long overdue commit.. many changes.
In short, I'm very close to having connection establishment and eager send/recv working.

Part of the connection process involves sending address information from the
client to server.  For some reason, I am never receiving an event indicating
completion of the send on the client side.  Otherwise, connection
establishment is working and eager send/recv should be trivial from here.


Some more detailed changes:
 - Send partially implemented, just handles starting up new connections.
 - Several support functions implemented for establishing connection.  Client
   side code went in btl_udapl_endpoint.c, server side in btl_udapl_component.c
 - Frags list and send/recv locks added to the endpoint structure.
 - BTL sets up a public service point, which listens for new connections.
   Steps over ports that are already bound, iterating through a range of ports.
 - Remove any traces of recv frags, don't think I need them after all.
 - Pieces of component_progress() implemented for connection establishment.
 - Frags have two new types for connection establishment - CONN_SEND and
   CONN_RECV.
 - Many other minor cleanups not affecting functionality

This commit was SVN r9345.
2006-03-21 00:12:55 +00:00
George Bosilca
e181153f16 Remove the bogus prototype.
This commit was SVN r9333.
2006-03-19 19:22:35 +00:00
George Bosilca
a0d25ab6ef Add missing prototype for the mvapi diagnostic function.
This commit was SVN r9331.
2006-03-18 19:38:56 +00:00
Tim Woodall
bd870519fd - modified convertor copy_and_prepare routines to accept an additional
flag, new flags to be included when convertor is initialized
- modified pml/btl module defs and added stub functions for diagnostic
  output routines to dump state of queues / endpoints
- updates to data reliability pml

This commit was SVN r9329.
2006-03-17 18:46:48 +00:00
Tim Woodall
712468dbef add diagnostic interface
This commit was SVN r9328.
2006-03-17 17:39:41 +00:00
Tim Woodall
c34f4c2cb7 correct cleanup for threaded case
This commit was SVN r9291.
2006-03-16 00:05:39 +00:00
Galen Shipman
440417e92c Add max_btls option
This commit was SVN r9263.
2006-03-13 17:03:21 +00:00
Sven Stork
12b94972e2 Fix comment of a parameter.
This commit was SVN r9261.
2006-03-13 09:11:46 +00:00
Brian Barrett
d041558f85 * protect lock when not building threaded
This commit was SVN r9254.
2006-03-11 03:20:50 +00:00
Brian Barrett
3e2c51dea8 * fix some silly commenting done by a previous developer that are good for
a laugh but probably not good for usability ;)

This commit was SVN r9253.
2006-03-11 03:09:24 +00:00
Tim Woodall
9ae910044b resolve threading issue
This commit was SVN r9234.
2006-03-09 17:59:05 +00:00
Tim Woodall
8bf6ed7a36 - corrected locking in gm btl - gm api is not thread safe
- initial support for gm progress thread
- corrected threading issue in pml
- added polling progress for a configurable number of cycles to wait for threaded case

This commit was SVN r9188.
2006-03-02 00:39:07 +00:00
Brian Barrett
579e74290f * make gm wire-up endian safe
This commit was SVN r9179.
2006-02-28 02:03:46 +00:00
Brian Barrett
bfd49d248b * (hopefully) fix MPI_BOTTOM for portals, same way as all the other RDMA btls from
eons ago...

This commit was SVN r9172.
2006-02-27 17:07:24 +00:00
Brian Barrett
9b19e3fef0 * remove some debugging output that shouldn't have been committed. Doh!
This commit was SVN r9171.
2006-02-27 16:23:52 +00:00
Brian Barrett
285581dff2 More endian-related cleanups:
- moved hton64 and ntoh64 from the bunch of places it had been copied
    into one header file
  - properly set and use the btl_tcp's nbo option to put things in
    network byte order on the wire if both sides don't have the same
    endianness
  - Put the OB1 PML's headers (with a couple exceptions I need to discuss
    with Tim) in network byte order on the wire if both sides don't have
    the same endianness
  - since it was needed for the TCP BTL, move the orte_process_name_t
    HTON and NTOH macros from the TCP OOB to ns_types.h

This commit was SVN r9145.
2006-02-26 00:45:54 +00:00
Jeff Squyres
628125599d Fix the TCP btl module endpoint matching during setup for the scenario
when running an MPI job spanning a node that has two TCP NICs and a
node that has one TCP NIC.  Previously, for the 2 NIC/module process,
we would return the first peer IP address if we couldn't find a subnet
match with any of the peer's published IP addresses -- this was to
support running OMPI across subnet boundaries.  Changed the behavior
to only do that behavior if the IP address we're trying to match is
public (i.e., not 10.x.y.z, 192.168.x.y, or 172.16.x.y) *and* any of
the remote peer's addresses are public (working on the assumption that
if we both have public addresses, they're routable to each other).

This definitely will not work in all scenarios, such as when we go to
WAN kinds of executions, and will need to be revisited at that time.

This commit was SVN r9119.
2006-02-23 02:02:19 +00:00
Galen Shipman
e58b758031 standardize behavior of btl_alloc, if the size is larger than the max send
size, btl_alloc returns NULL. 

This commit was SVN r9114.
2006-02-22 17:37:59 +00:00
Brian Barrett
0d098c9d57 * after talking with galen, take into account we might truncate
This commit was SVN r9113.
2006-02-22 17:03:04 +00:00
Brian Barrett
765d2ffc29 * the self btl should set the segment size field on alloc like the other btls
* clean up duplicate free in long message accumulates that looks like it was
  a cut-n-paste error

This commit was SVN r9112.
2006-02-22 16:20:13 +00:00
Brian Barrett
08747dcaf8 * Throttle the number of incoming receives so that we don't overrun our receive
event queue and lose receive messages.

This commit was SVN r9006.
2006-02-13 15:59:54 +00:00
Brian Barrett
20d06e889e * revert out some of the attempts to better use the Portals 3.3.2-2 user-space
run-time support, as it appears to be doing bad things to memory.  Update
  the hack to get the local nid to match the recent TCP nal changes, and
  update the P3RT api usage

This commit was SVN r9005.
2006-02-13 15:41:00 +00:00
George Bosilca
ecc3e00362 Various cleanups.
This commit was SVN r9002.
2006-02-12 21:36:07 +00:00
Brian Barrett
566a050c23 Next step in the project split, mainly source code re-arranging
- move files out of toplevel include/ and etc/, moving it into the
    sub-projects
  - rather than including config headers with <project>/include, 
    have them as <project>
  - require all headers to be included with a project prefix, with
    the exception of the config headers ({opal,orte,ompi}_config.h
    mpi.h, and mpif.h)

This commit was SVN r8985.
2006-02-12 01:33:29 +00:00
Andrew Friedley
b37e18916f Many different things, the big ones:
- Start filling in the progress function, focusing on connection establishment.
 - Initialize udapl mpool and free lists
 - Create/destroy a protection zone with each IA
 - Misc organization as I learn how things work

This commit was SVN r8969.
2006-02-10 21:49:15 +00:00
George Bosilca
9f1357fb89 Remove all the useless includes. Most of the endpoints do not depend on the
orte includes.

This commit was SVN r8932.
2006-02-08 05:10:48 +00:00
Galen Shipman
c8045bf397 Fixup for ORTE datatype checkin,
- use appropriate header files 
- change calls from orte_dps to orte_dss 

This commit was SVN r8920.
2006-02-07 15:20:44 +00:00
George Bosilca
20fd358327 A BTL cannot depend on the orte data-types.
This commit was SVN r8915.
2006-02-07 06:05:36 +00:00
Ralph Castain
4b9f015c0b Merge in the new data support subsystem for ORTE. MPI folks should not notice a difference. Longer explanation will be sent to developers mailing list.
This commit was SVN r8912.
2006-02-07 03:32:36 +00:00
Rainer Keller
7ac0ffc349 - Instead of doing the unlock inside the if, just move the if-statement
later after the mandatory unlock.

This commit was SVN r8885.
2006-02-02 17:32:22 +00:00
Tim Woodall
a2fde48f2f changes from release branch
This commit was SVN r8858.
2006-01-31 16:17:18 +00:00
Tim Woodall
bcd6c525f8 removed duplicate locks
This commit was SVN r8857.
2006-01-31 16:12:37 +00:00
Tim Woodall
9d484916db remove locks already held
This commit was SVN r8853.
2006-01-31 14:23:08 +00:00
Brian Barrett
b1d2424013 Merge in present work on the MPI-2 onesided chapter. The current code is not
complete, but stable enough that it will have no impact on general development,
so into the trunk it goes.  Changes in this commit include:

 - Remove the --with option for disabling MPI-2 onesided support.  It
   complicated code, and has no real reason for existing
 - add a framework osc (OneSided Communication) for encapsulating
   all the MPI-2 onesided functionality
 - Modify the MPI interface functions for the MPI-2 onesided chapter
   to properly call the underlying framework and do the required
   error checking
 - Created an osc component pt2pt, which is layered over the BML/BTL
   for communication (although it also uses the PML for long message
   transfers).  Currently, all support functions, all communication
   functions (Put, Get, Accumulate), and the Fence synchronization
   function are implemented.  The PWSC active synchronization
   functions and Lock/Unlock passive synchronization functions are
   still not implemented

This commit was SVN r8836.
2006-01-28 15:38:37 +00:00
George Bosilca
0f1c6d79e8 Make the MVAPI BTL thread safe again. The problem was a double locking on the endpoint mutex.
It's still not very clean as we still lock the mvapi_btl mutex inside a critical section
protected by the endpoint mutex ...

This commit was SVN r8810.
2006-01-25 23:14:06 +00:00
Galen Shipman
ddc22d8c7e Use endpoint_lock, not ib_lock (copy paste error from openib btl)
This commit was SVN r8806.
2006-01-25 15:04:37 +00:00
Andrew Friedley
ec995160e6 Checkpoint for switch to mpool work:
- Remove printing of CFLAGS in configure.m4
 - Set MCA_BTL_FLAGS_SEND flag
 - Improved error handling during module initialization
 - Extract the address of each interface with dat_ia_query
 - Start playing around with fragment stuff - probably wrong
 - Misc code cleanup (removal of GM-specific code)

This commit was SVN r8801.
2006-01-25 02:21:34 +00:00
Tim Woodall
51ec050647 port of revised flow control from openib
This commit was SVN r8799.
2006-01-24 23:44:30 +00:00
Tim Woodall
e861158fcd - removed debug code
- removed extraneous memset

This commit was SVN r8798.
2006-01-24 23:38:41 +00:00
Galen Shipman
1e0ea9dd6d Major fixes for the RDMA registration cache (leave_pinned).
This commit fixes issues with HPL runs on node counts > 4. 

This commit was SVN r8793.
2006-01-23 22:51:50 +00:00
Brian Barrett
8b2a285f8f * update reference implementation support to match latest release
This commit was SVN r8783.
2006-01-22 04:56:18 +00:00
Galen Shipman
d657052510 misc cleanup..
This commit was SVN r8731.
2006-01-18 16:20:50 +00:00
Galen Shipman
84a09e4f4e use #if not #ifdef..
This commit was SVN r8720.
2006-01-17 21:07:34 +00:00
Galen Shipman
0c81c0a6ce use ibv_get_device_list if present
(submitted from roland)

This commit was SVN r8712.
2006-01-17 16:23:35 +00:00
Andrew Friedley
5ccab7bcda Checkpoint:
- Move mca_btl_udapl_error/mca_btl_module_init to mca_btl_udapl.c and rename it
 - White space cleanups
 - Free the uDAPL evd and ia handles in mca_btl_udapl_finalize

This commit was SVN r8705.
2006-01-16 21:54:50 +00:00
Andrew Friedley
a4abe3bdbe Checkpoint:
- Borrow configure.m4 from the mvapi btl.  One of the uDAPL headers emits a
   warning when -pedantic is enabled, so strip it out.
 - Change function check in ompi_check_dapl.m4 from dat_ia_open to
   dat_registry_list_providers.. dat_ia_open wasn't working right
 - Make the references to prepare_dst, put, and get NULL for now
 - Add opal_output() calls in all the udapl interface functions for debugging
 - Add evd_qlen component parameter to control event dispatcher queue length
 - First stab at component_init and module_init
 - Misc cleanups - whitespace, dead code removal
 - Update copyrights to 2006

This commit was SVN r8701.
2006-01-16 03:01:12 +00:00
George Bosilca
d4699037f7 Protect an assert if the endpoint cache is not activated.
This commit was SVN r8695.
2006-01-14 21:10:09 +00:00
George Bosilca
3317bf81ad A better implementation for the TCP endpoint cache + few comments.
This commit was SVN r8692.
2006-01-14 20:21:44 +00:00
Tim Woodall
a584c60dbe re-worked flow control logic to take into account the return
of credits from the peer prior to local completion, so that
we don't overrun the number of send wqes available.

This commit was SVN r8683.
2006-01-12 23:42:44 +00:00
Andrew Friedley
c0bad339af - Use the GM BTL as a template instead, per Tim's suggestion
- Begin adding uDAPL-specific stuff
- Added config/ompi_check_udapl.m4 - hopefully I did this right

This commit was SVN r8681.
2006-01-12 04:05:02 +00:00
Tim Woodall
63d0438991 merge in changes from release branch
This commit was SVN r8637.
2006-01-04 16:34:45 +00:00
George Bosilca
479d510eaf Use the common SM component to unmap the shared memory file.
This commit was SVN r8623.
2005-12-31 15:07:48 +00:00
George Bosilca
228f966798 A little trick to force qsort to order the sm modules in the order we expect.
On some systems (like Windows) qsort can modify the order of modules when the
comparison function returns 0. As we expect to have
mca_btl_sm_add_procs_same_base_addr called before mca_btl_sm_add_procs, we have
to force qsort to return the SM modules in the correct order. Giving the
same_addr module a slightly higher priority solves this problem.
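
The underlying issue is that the C standard does not require qsort to be stable, so elements that compare equal may come back in any order. A sketch of the workaround, using a hypothetical `sketch_module` type and priority values chosen only for illustration:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical module record; the real module structure is not shown here. */
struct sketch_module { const char *name; int priority; };

/* Order modules by descending priority. Returning 0 for two modules would
 * leave their relative order up to the qsort implementation. */
static int by_priority_desc(const void *a, const void *b)
{
    const struct sketch_module *ma = a;
    const struct sketch_module *mb = b;
    return (mb->priority > ma->priority) - (mb->priority < ma->priority);
}

void sort_modules(struct sketch_module *mods, size_t n)
{
    qsort(mods, n, sizeof mods[0], by_priority_desc);
}
```

With distinct priorities the comparator never returns 0 for the two sm modules, so the result no longer depends on the platform's qsort.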

This commit was SVN r8620.
2005-12-31 14:59:53 +00:00
Tim Woodall
4ff5316b2d correct copy/paste error
This commit was SVN r8601.
2005-12-22 18:07:46 +00:00
Tim Woodall
5d91c492d6 improve diagnostic error messages
This commit was SVN r8600.
2005-12-22 18:01:47 +00:00
Tim Woodall
e9498f7a75 improve error reporting when registrations fail
This commit was SVN r8598.
2005-12-22 16:05:28 +00:00
Andrew Friedley
f402854a96 Initial commit of uDAPL BTL component.
- Copied the template BTL and renamed everything
 - Compiles and shows up correctly in ompi_info, not tested past that
 - Should be ignored for everyone but me

This commit was SVN r8544.
2005-12-19 16:37:05 +00:00
Brian Barrett
a5af07cd6b fixes suggested by Ralf for supporting both Libtool 1 and 2 in Open MPI...
This commit was SVN r8538.
2005-12-19 03:10:23 +00:00
George Bosilca
1b667067d6 I need to know the number of iovec attached to the fragment.
This commit was SVN r8447.
2005-12-10 23:28:16 +00:00
George Bosilca
e5158142b9 The lb should be extracted from the datatype not from the convertor.
This commit was SVN r8446.
2005-12-10 23:27:20 +00:00
George Bosilca
01b0db91ae Get the lower-bound from the data not from the convertor.
This commit was SVN r8444.
2005-12-10 22:38:25 +00:00
George Bosilca
7baae4f394 Protect the headers and remove the unused ones.
This commit was SVN r8439.
2005-12-10 22:04:28 +00:00
Tim Woodall
1929a97d2f corrections for MPI_BOTTOM
This commit was SVN r8429.
2005-12-09 23:27:55 +00:00
George Bosilca
8888bfb063 And the thread-safe version. The lock/unlock macros are supposed to be
empty for non-threaded builds, but somehow just by moving the code around a
little bit and removing 2 calls to lock/unlock the latency for TCP
went down by 2 microseconds ...

This commit was SVN r8426.
2005-12-09 05:16:50 +00:00
Jeff Squyres
6fbd321442 Fix a bunch of install locations for header files
This commit was SVN r8406.
2005-12-08 00:54:44 +00:00
George Bosilca
5851b55647 Improve the latency for small and medium messages. The idea is to decrease the
number of recv system calls by caching the data. Each endpoint has a buffer
(the size is an MCA parameter) that can be used as a cache. Before each receive
operation this buffer is added at the end of the iovec list. All data that is
not expected by the fragment goes into this cache. If the cache contains data,
all subsequent receives just memcpy the data into the BTL buffers.

The only drawback is that we spin around the receive_handle until all the
cached data has been read by the PML layer. This limitation comes from the fact that
the event library is unable to call us if there are no events on the socket.
Therefore we are unable to keep the data in the cache until the next loop
into the progress engine.
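
The idea of appending the cache to the iovec list can be sketched with a single readv() call. The helper name and signature below are hypothetical and the real endpoint code is more involved; this only shows how surplus bytes spill into the cache:

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* One readv() fills the fragment buffer first; any bytes beyond what the
 * fragment expects land in the endpoint cache. Returns the number of bytes
 * placed in the fragment (or -1 on error) and stores the number of cached
 * bytes in *cached. */
ssize_t recv_with_cache(int fd, void *frag, size_t frag_len,
                        void *cache, size_t cache_len, size_t *cached)
{
    struct iovec iov[2];
    iov[0].iov_base = frag;
    iov[0].iov_len  = frag_len;
    iov[1].iov_base = cache;      /* cache goes last in the iovec list */
    iov[1].iov_len  = cache_len;

    ssize_t n = readv(fd, iov, 2);
    *cached = 0;
    if (n < 0) {
        return n;
    }
    if ((size_t)n > frag_len) {
        *cached = (size_t)n - frag_len;
        return (ssize_t)frag_len;
    }
    return n;
}
```

A later receive would then be served from the cache with memcpy before touching the socket again, which is where the saved system calls come from.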

This commit was SVN r8398.
2005-12-07 00:12:59 +00:00
Brian Barrett
38391e3406 disable shared receive queue support at compile time if the mvapi implementation
does not support shared receive queues (such as the one shipped by SilverStorm / 
Infinicon for OS X).  Reviewed by Galen.

This commit was SVN r8389.
2005-12-06 15:46:30 +00:00
Tim Woodall
3f396aeae9 fix send to self for large messages
This commit was SVN r8379.
2005-12-05 23:36:33 +00:00
Tim Woodall
8c443832ae add a parameter to limit max number of btls (HCA ports used)
This commit was SVN r8342.
2005-11-30 22:18:21 +00:00
Tim Woodall
5db38b38f5 corrections for latency issue
- don't do additional select until non-blocking read fails 
- don't do an additional read for 0 byte message

This commit was SVN r8312.
2005-11-29 17:33:01 +00:00
George Bosilca
9d990af4a5 Remove 2 useless functions. They were replaced by the mca_base version a few commits ago.
This commit was SVN r8287.
2005-11-28 20:14:23 +00:00
Galen Shipman
55a9fbefd8 fix misc compiler warnings..
This commit was SVN r8263.
2005-11-27 22:53:30 +00:00
George Bosilca
bfa4a40983 Cast it to a well-known type to remove a warning.
This commit was SVN r8261.
2005-11-26 21:17:15 +00:00
George Bosilca
b9a739e2b6 Remove 2 useless assignments (they are done at the end before the return).
This commit was SVN r8260.
2005-11-26 21:16:30 +00:00
George Bosilca
00c10a6372 Make the MX BTL startup scalable. As the number of processes involved in the MPI application
increases, the previous connection code broke down: it could take as much as 60 seconds to connect
64 processes. Now we do not create the connections when we add the procs but only when we send
them the first message. It now takes only 1.6 seconds to set up a 64-process MPI job over MX (doing a 2-step barrier in order to ensure that we create all the connections).
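
The connect-on-first-send pattern can be sketched as below. All names are hypothetical and the actual MX calls are omitted; the counter exists only to make the deferred behavior observable:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical endpoint: only tracks whether a connection exists. */
struct sketch_endpoint { bool connected; };

static int connects_made;   /* number of connections actually opened */

static void sketch_connect(struct sketch_endpoint *ep)
{
    ep->connected = true;   /* a real BTL would connect to the peer here */
    connects_made++;
}

/* Adding a proc only records the peer; no connection is opened. */
void sketch_add_proc(struct sketch_endpoint *ep)
{
    ep->connected = false;
}

/* The first send to a peer pays the connection cost; later sends do not. */
void sketch_send(struct sketch_endpoint *ep, const void *buf, size_t len)
{
    if (!ep->connected) {
        sketch_connect(ep);
    }
    (void)buf;              /* payload transfer omitted in this sketch */
    (void)len;
}
```

Startup cost then scales with the number of peers a process actually talks to rather than with the size of the job.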

This commit was SVN r8252.
2005-11-23 23:48:56 +00:00
George Bosilca
7c73095440 Update the correct sent size.
This commit was SVN r8237.
2005-11-22 21:51:04 +00:00
Brian Barrett
20cea60b82 * fix "make distclean" error in PML
* turns out (duh!) that there was a reason that the <projectdir>dir
  variable was set in the AM conditional.  If not, stupid directories
  are created and not needed...  duh.

This commit was SVN r8205.
2005-11-20 07:41:09 +00:00
Brian Barrett
8faa1884f0 * The last of the build system optimizations. Combine the component and
component/base Makefile.am files, reducing the time configure spends
  stamping out Makefiles at the end
* Install base_impl.h file when devel-headers are being installed

This commit was SVN r8200.
2005-11-20 01:03:01 +00:00
Galen Shipman
eb3ccdb4d8 make compiler happy on false positive warning..
This commit was SVN r8192.
2005-11-18 18:48:11 +00:00
Galen Shipman
4fce90a37b one last warning fixed on 32 bit platforms.
This commit was SVN r8191.
2005-11-18 17:27:09 +00:00
Galen Shipman
635e7a682b fix for 32bit compile warnings.
This commit was SVN r8190.
2005-11-18 17:08:51 +00:00
George Bosilca
bba42f5e49 We are allowed to call mx_set_error_handler before any other MX functions, even before mx_init.
With the errors set to return mx_init will not force the application to exit if there is no MX kernel
module loaded.

This commit was SVN r8184.
2005-11-17 18:47:27 +00:00
Galen Shipman
dde38d4119 reset sg_entry->addr to point at header when sending control messages.
cast to uint64_t (the correct datatype per verbs.h) instead of uintptr_t. 

This commit was SVN r8175.
2005-11-17 05:45:33 +00:00
Tim Woodall
58dd6c2493 - merge from release branch
This commit was SVN r8174.
2005-11-17 05:32:30 +00:00
Tim Woodall
01b94862df merge from release branch
This commit was SVN r8168.
2005-11-16 17:12:44 +00:00
Tim Woodall
142b7cc682 merge from release branch
This commit was SVN r8167.
2005-11-16 17:10:49 +00:00
George Bosilca
7ad6b2b70e Add an MCA param to enable/disable the MX shared memory capabilities. Right now this param
is labeled as internal so the users will not see it but it is not read-only so we can still
play with it (that's for our internal tests). This is supposed to disappear later after the
next (or next next) release of the MX library, but we need it now as a quick fix before the
release.

This commit was SVN r8161.
2005-11-15 20:54:45 +00:00
Tim Woodall
54b6acb2b4 merge from release branch
This commit was SVN r8149.
2005-11-13 23:31:20 +00:00
Jeff Squyres
e6a3a406e2 Remove debugging printf
This commit was SVN r8139.
2005-11-13 14:57:44 +00:00
Jeff Squyres
97b97f84b8 Next checkpoint in the sm btl fixes:
- Add big comment about a general overview of what the sm btl is doing
- random small code cleanups
- fix instances of mca_btl_sm[0] to mca_btl_sm[1] where relevant
- remove a lot of unused, confusing, and incorrect interface functions
  from ompi_fifo.h and ompi_circular_buffer.h.  These functions, if
  they were used, would not work properly with the scheme that the sm
  btl uses with the fifos (i.e., receiver makes right -- if necessary)
- add some missing offset computations in the fifo and circular buffers
- change the types of offsets to be ssize_t, not size_t
- remove an offset parameter from a function that didn't need it

This commit was SVN r8135.
2005-11-12 22:32:09 +00:00
Jeff Squyres
6444887373 - Add copyright headers to btl_sm_frag.h
- Ensure to convert base_shared_mem_flags to be a relative offset in
  the global storage, and then to convert that back to an absolute
  virtual address before we try to use it
- Don't double increment n_local_procs when calculating the peer rank
  during bootstrapping of the different base address case

Something else is still wrong; if mmap() returns a different base
address, things don't work (i.e., segv or hang forever when you try to
send a message).  More specifically, the bootstrapping now seems to
correctly handle the case when mmap() base addresses are different,
but the message passing does *not* -- it always assumes that the
mmap() base addresses are the same.

Still working on the fix for that -- want to checkpoint what has been
done so far to facilitate working on different machines...
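
The offset conversion described in the list above can be sketched like this. The helper names are hypothetical, and `ptrdiff_t` is used here as a portable signed type where the sm btl work mentions `ssize_t`:

```c
#include <stddef.h>

/* Convert an absolute address inside this process's mapping into an
 * offset relative to the mapping base, suitable for shared storage. */
ptrdiff_t to_offset(const void *base, const void *ptr)
{
    return (const char *)ptr - (const char *)base;
}

/* Convert a stored offset back into an absolute address, using the
 * base address at which *this* process mapped the segment. */
void *from_offset(void *base, ptrdiff_t off)
{
    return (char *)base + off;
}
```

Only offsets are stored in the shared segment; each process adds its own base before dereferencing, so differing mmap() base addresses do not matter for values converted this way.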

This commit was SVN r8134.
2005-11-12 14:04:46 +00:00
Galen Shipman
5a4b1ebdd4 in mca_btl_openib_endpoint_post_send: set opcode on work request before potentially inserting it on pending list..
This commit was SVN r8127.
2005-11-12 02:11:14 +00:00
George Bosilca
e297b58fbd Add more MCA arguments.
Make some of them system (not seen by the user) and read-only.
Small cleanups.

This commit was SVN r8126.
2005-11-12 00:31:59 +00:00
Galen Shipman
5cf2d8d40c default to first available IP address if no matching subnets found..
This commit was SVN r8125.
2005-11-12 00:31:34 +00:00
Tim Woodall
654ba6d262 srq cleanup
This commit was SVN r8106.
2005-11-10 23:29:54 +00:00
Tim Woodall
2013104d1a SRQ cleanup
This commit was SVN r8104.
2005-11-10 20:51:56 +00:00
Tim Woodall
4a06e8463c port of flow control from mvapi
This commit was SVN r8102.
2005-11-10 20:15:02 +00:00
Tim Woodall
985c2ca943 cleanup
This commit was SVN r8093.
2005-11-10 15:40:27 +00:00
George Bosilca
8119c970db Improve the connection algorithm for MX. There are 2 problems here:
- first, we set up the connections with all the peers at the beginning
- MX does not handle well the case where several peers make connections to the same
  destination simultaneously.

So I changed the order in which we connect. First we compute our rank in the array,
then we set up connections in a round-robin fashion, starting with our left neighbor.
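
One way to read the round-robin ordering is sketched below. The function name is hypothetical, and walking further left after the left neighbor is an assumption; the message does not state the direction:

```c
#include <stddef.h>

/* Fill order[0 .. nprocs-2] with the peers of `rank`, starting at the left
 * neighbor (rank - 1) and continuing leftward around the ring. */
void connect_order(size_t rank, size_t nprocs, size_t *order)
{
    for (size_t step = 1; step < nprocs; step++) {
        order[step - 1] = (rank + nprocs - step) % nprocs;
    }
}
```

Because every rank starts at a different peer, no single destination receives connection requests from all processes at the same step.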

This commit was SVN r8075.
2005-11-10 01:15:49 +00:00
George Bosilca
dc1ad885d1 Move the output message outside the loop. We print an error message only once when we fail to
connect to a peer. Bonus: we print some additional information, like its MAC address or name
if it's in our tables.

This commit was SVN r8074.
2005-11-10 01:13:18 +00:00
Tim Woodall
62fd74140b decrease socket buffers sizes to same as ptl code
This commit was SVN r8072.
2005-11-10 00:40:55 +00:00
Tim Woodall
b5ed723ea4 - check for null return
- disable debug

This commit was SVN r8070.
2005-11-10 00:02:18 +00:00
Galen Shipman
3079fc2da1 use correct lock for threaded build..
This commit was SVN r8055.
2005-11-09 16:09:05 +00:00
Tim Woodall
78522ed454 send credits on correct qp
This commit was SVN r8050.
2005-11-08 22:59:44 +00:00
Tim Woodall
b4ca28da4b removed debug
This commit was SVN r8046.
2005-11-08 21:41:02 +00:00
Tim Woodall
2d9c509add flow control
This commit was SVN r8039.
2005-11-08 16:50:07 +00:00
Jeff Squyres
42ec26e640 Update the copyright notices for IU and UTK.
This commit was SVN r7999.
2005-11-05 19:57:48 +00:00
Tim Woodall
31eb35c3f1 correct rnr parameter - need to review this code and pass correct data type
This commit was SVN r7936.
2005-10-31 17:18:39 +00:00
George Bosilca
b0def3f6bf MX has 2 limitations regarding the iovecs. First, they do not support iovecs with a total size
larger than 32K for inter-node transfers ... and then they do not support iovecs larger than
16K for inter-node transfers. Therefore we have to set the size of our first fragment to
16K to match both cases.

This commit was SVN r7926.
2005-10-28 20:37:43 +00:00
Galen Shipman
4a15761732 add support for srq limit reached async event, even though it doesn't appear
to be supported by mellanox vapi.. perhaps this will be supported in the near
future, for now it doesn't hurt to have it in the trunk

Also cleanup the receive descriptor posting macros..

This commit was SVN r7903.
2005-10-27 22:47:19 +00:00
Tim Woodall
3bd5b81dfa Submitted: Gleb Natapov
This commit was SVN r7899.
2005-10-27 17:48:40 +00:00
Tim Woodall
13409ec53b correction for hang, check for additional fragments before callback,
which may queue a new fragment

This commit was SVN r7889.
2005-10-27 01:39:39 +00:00
Galen Shipman
cb84a57c57 add endpoint and srq flow-control..
Note, we are failing the ring tests in the intel p2p test suite, but we seem
to fail the same tests under the current trunk.. will look into this further. 

This commit was SVN r7823.
2005-10-21 02:21:45 +00:00
Galen Shipman
0d1d231169 convert to new mca params, adding description strings.
changed mca param rr_buf_min/max to rd_min/max 
Add bandwidth param to openib 

This commit was SVN r7815.
2005-10-20 02:55:21 +00:00
Brian Barrett
de5e501519 Rather than hard spinning waiting for something to happen when doing shared
memory initialization, call opal_progress() to push any pending events
around and possibly yield the processor if nothing entertaining is happening.

This should probably go to the 1.0 branch.

This commit was SVN r7808.
2005-10-19 00:56:14 +00:00
Galen Shipman
4d2d39b0a6 initial checkin of SRQ flow control support for mvapi
This commit was SVN r7796.
2005-10-18 14:55:11 +00:00
Galen Shipman
3efecaaeda convert openib btl to use new mca_param registration.. Also, change rr_buf_min
and rr_buf_max to rd_min and rd_max 

This commit was SVN r7786.
2005-10-17 20:00:34 +00:00
Tim Woodall
c944988b9e merge in changes from release branch - acquire/release send token for put/get
This commit was SVN r7784.
2005-10-17 18:59:28 +00:00
Brian Barrett
1302cb4072 The next in a long line of crazed build system changes from Brian. This was
originally suggested by Ralf Wildenhues, to try to speed autogen, configure,
and make (and possibly even make install).  Use automake's include directive
to drastically reduce the number of Makefile files (although the number of
Makefile.am files is the same - most are just included in a top-level
Makefile.am).  Also use an Automake SUBDIRs feature to eliminate the
dynamic-mca tree, which was no longer really needed.  This makes adding
a framework easier (since you don't have to remember the dynamic-mca
tree) and makes building faster (as make doesn't have to recurse through
the dynamic-mca tree)

This commit was SVN r7777.
2005-10-17 00:21:10 +00:00
Tim Woodall
d859855dea merge in changes from 1.0
This commit was SVN r7728.
2005-10-12 15:54:35 +00:00
Galen Shipman
23cbac25c8 lower default free list sizes..
This commit was SVN r7676.
2005-10-09 18:15:12 +00:00
Galen Shipman
fb19cc4177 compiler warning fixes..
This commit was SVN r7661.
2005-10-07 17:38:34 +00:00
George Bosilca
1fe18814da Decrease the default length for the first fragment.
This commit was SVN r7643.
2005-10-06 00:05:01 +00:00
George Bosilca
0f04132b13 mx_connect in the MX documentation is supposed to take a timeout in seconds. However, in real life it seems that the timeout should be in microseconds.
This commit was SVN r7642.
2005-10-06 00:04:27 +00:00
Tim Woodall
3b4a134a24 - removed unused define
- correct free to release registration rather than retain it

This commit was SVN r7611.
2005-10-04 14:33:26 +00:00
George Bosilca
6b3d02b514 Warning cleanups. On some OSes the iov_base member of the iovec structure is defined as a void *,
on others as a char *. Thus the right side of all assignments should be explicitly cast to void * in
order to avoid any casting complaints from the compilers.

This commit was SVN r7607.
2005-10-04 12:36:07 +00:00
George Bosilca
3453a6c0e9 Remove some compiler warnings about unused variables
Correctly define the 64 bits constants.
Some minor cleanups.

This commit was SVN r7606.
2005-10-04 12:29:51 +00:00
George Bosilca
492c0e59dc Correct the casting type and remove some useless output (already commented out).
This commit was SVN r7605.
2005-10-04 12:28:47 +00:00
Galen Shipman
eefe0fd04a fix threaded compile
fix misc warnings 
cleanup posting of receive descriptors 
comment why we retain before deregister in rcache_rb_mru.c 

This commit was SVN r7595.
2005-10-03 16:35:12 +00:00
Galen Shipman
f46548e691 Add SRQ support to OpenIB btl, removed old mca param - not used..
This commit was SVN r7585.
2005-10-02 18:58:57 +00:00
Galen Shipman
67d38b7896 Add multi-nic support to openib
Fix connection establishment race in openib 
Other misc 

This commit was SVN r7570.
2005-09-30 22:58:09 +00:00
Brian Barrett
db872a0fbb * check that return from ibv_get_devices isn't NULL before calling dlist_start().
On thor, if IB is down, we get NULL back from ibv_get_devices(), which then
  caused segfaults in dlist_start().
* Pretty-print error message if no HCAs found

This commit was SVN r7557.
2005-09-30 14:58:59 +00:00
Jeff Squyres
fcef1774d5 Per advice from Ralf W., change the pkgdata declarations in
Makefile.am's to be a *slightly* more correct (and, more importantly,
less error-prone) construct.

This commit was SVN r7554.
2005-09-30 13:32:39 +00:00
Jeff Squyres
80b7deb4d7 Add in EXTRA_DIST to get helpfile in tarballs
This commit was SVN r7553.
2005-09-30 10:25:04 +00:00
Brian Barrett
7b20370306 * pretty-print an error message if a btl component loads but can't find
any NICs to use
* Make mvapi, gm, and mx components all publish information, even if there
  are no NICs available so that modex_recv doesn't hang.  If there are no
  NICs available, don't set the reachable bit, but don't do anything
  to fail.  This unfortunately doesn't cover the hangs that will result if
  different procs load different sets of components, but it's a start

This commit was SVN r7550.
2005-09-30 04:39:44 +00:00
Brian Barrett
a77c908496 * the last of the tuning params for portals
This commit was SVN r7548.
2005-09-30 04:05:31 +00:00
Galen Shipman
8239e635b9 fix misc warnings, cleanup macro..
This commit was SVN r7547.
2005-09-30 03:13:51 +00:00
Brian Barrett
997644af31 * There are now two forms of ibv_create_cq, one with 3 params and one with 5.
Try to detect which form this version of Open IB uses, defaulting to the 5
  version if we can't figure it out (the new version has 5 params)
* Only add -lcm if it exists on the system - some versions of Open IB
  apparently don't need it.

This commit was SVN r7542.
2005-09-29 13:35:57 +00:00
Galen Shipman
26a74d42fa release, not retain on gm_free
This commit was SVN r7535.
2005-09-28 20:18:52 +00:00
Galen Shipman
af04b3e1ab fix warnings..
This commit was SVN r7515.
2005-09-27 14:23:51 +00:00
Galen Shipman
3c97b3f722 Modified the registration to include a base_align and bound_align for
searching the tree. Modified the memory callback to search the tree at each
page boundary for registrations. This is necessary as an application may
malloc memory and send out of any portion of that memory, even discontiguous
regions. 
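
The page-aligned lookup keys can be sketched as below. The names base_align and bound_align come from the message, but their exact semantics are an assumption here: the sketch treats them as the first byte of the first page and the last byte of the last page touched by a buffer.

```c
#include <stdint.h>

/* Round an address down to the start of its page.
 * page_size must be a power of two. */
uintptr_t page_base(uintptr_t addr, uintptr_t page_size)
{
    return addr & ~(page_size - 1);
}

/* Last byte of the page that holds the final byte of [addr, addr + len).
 * len must be greater than zero. */
uintptr_t page_bound(uintptr_t addr, uintptr_t len, uintptr_t page_size)
{
    return page_base(addr + len - 1, page_size) + page_size - 1;
}
```

Stepping from the aligned base to the aligned bound one page at a time gives the page boundaries at which a memory callback could search the tree for registrations.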

This commit was SVN r7510.
2005-09-27 02:01:21 +00:00
Brian Barrett
d9e80d8f2a * increase size of event queue for receives - it was too small to be useful
on a reasonably sized machine
* if no mpool exists, don't try to malloc out an array of 0 bytes

This commit was SVN r7507.
2005-09-25 17:04:03 +00:00
Galen Shipman
9fe5844071 decrement ref count on removal of registration from mru and tree.
add misc asserts to check for proper reference counting. 

ugly hack 1 -- use mallopt to never release memory ala sbrk - this is
commented out in mca_btl_mvapi_component_init

ugly hack 2 -- test registrations coming out of the tree via rcache_find; for
an unknown reason the tree is returning registrations where the address is not
within the base or bound of the registration. If this happens, we return
NULL.

comment out code to enable mem hooks if leave_pinned is set, note we can do
this via an mca param and will default it to leave_pinned with mem_hooks when
we iron out these issues. 

I am adding a unit test for the rcache. Note that we have a unit test for the
rb tree but the compare function is significantly different than that used for
registrations. After we have tracked down the issues with rcache_rb we will
remove the above hacks. 

This commit was SVN r7499.
2005-09-24 00:24:49 +00:00
Brian Barrett
50dc5499b4 * fix some remaining --with-btl-portals configure issues
This commit was SVN r7498.
2005-09-24 00:11:40 +00:00
Brian Barrett
0d68728b94 * add some more debugging output for send fragment issue to figure out why
Red Storm is complaining about invalid memory pointer (need to go back
  to Linux and look at this with valgrind)
* Turn off send in place for now, so I can run the tests on RS and see if
  everything else is ok

This commit was SVN r7497.
2005-09-23 19:30:54 +00:00
Brian Barrett
07b0b8c943 * add some useful debugging output
* fix dumb bug in btl_portals_get where I was using the dest descriptor key instead
  of the source descriptor key for the match bits, resulting in a PtlGet() with
  the wrong match bits

This commit was SVN r7496.
2005-09-23 15:30:18 +00:00
Tim Woodall
147716c249 added hostname to error output
This commit was SVN r7486.
2005-09-22 16:41:34 +00:00
Andrew Friedley
555ae37255 Add lib{opal,orte,mpi}.la to appropriate LIBADD's, some whitespace cleanup as well.
This commit was SVN r7477.
2005-09-22 12:28:54 +00:00
Tim Woodall
a74ca0062a reductions to initial memory footprint
This commit was SVN r7455.
2005-09-21 19:10:56 +00:00
Galen Shipman
4296e723c9 default free_lists to smaller size..
This commit was SVN r7454.
2005-09-21 18:55:07 +00:00
Galen Shipman
96ab5a6bd3 we can be in WAITING_ACK state without a race if the OOB ack is "slower" than
the scheduling of queued IB send operations. 

This commit was SVN r7452.
2005-09-21 16:47:08 +00:00
Tim Woodall
0ee34051f8 debug asserts
This commit was SVN r7449.
2005-09-21 15:30:17 +00:00
Tim Woodall
1b73d3856e possible race condition - set endpoint state before sending connect ack
This commit was SVN r7448.
2005-09-20 21:03:55 +00:00
Brian Barrett
d81726833e * Add memory barriers for shared memory. Rich and I think we got them
all and the Intel tests pass slightly oversubscribed.

This commit was SVN r7431.
2005-09-19 16:28:25 +00:00
Tim Woodall
aeb5bc3f57 still need to cleanup/revise the template for mpool changes
This commit was SVN r7425.
2005-09-19 14:34:24 +00:00
George Bosilca
b5cb27c006 The self should use self named files.
This commit was SVN r7421.
2005-09-18 12:37:15 +00:00
Galen Shipman
808b2c1c53 threaded build fix for btl_gm..
This commit was SVN r7409.
2005-09-16 17:18:15 +00:00
Tim Woodall
31d392af95 correct name
This commit was SVN r7376.
2005-09-14 22:35:58 +00:00
Tim Woodall
d190e6a315 handle losing a connection
This commit was SVN r7373.
2005-09-14 21:27:30 +00:00
Tim Woodall
c25fb5dab0 - fixed issue w/ btl send-in-place option that was affecting tcp
- reduced size of match header by an additional 4 bytes to 16 bytes
- corrections for buffered send (work in progress)

This commit was SVN r7371.
2005-09-14 17:08:08 +00:00
Brian Barrett
e98415eb7b * make tree compile on OS X
This commit was SVN r7370.
2005-09-14 15:52:42 +00:00
Galen Shipman
f0b1ea52bc if all else fails in prepare_src, pack
init the rdma_pending list in ob1

This commit was SVN r7366.
2005-09-14 04:41:33 +00:00
Brian Barrett
1290b8eed2 * some debugging to figure out why get isn't working on RS
This commit was SVN r7354.
2005-09-13 20:52:56 +00:00
George Bosilca
ad0c0cdc03 Make the GM btl compile again. There were just some typos.
This commit was SVN r7352.
2005-09-13 20:19:21 +00:00
Jeff Squyres
bbae6c3b1a Add missing header file
This commit was SVN r7338.
2005-09-13 12:19:34 +00:00
Galen Shipman
39f25428da missing includes, perhaps related to george's work?
This commit was SVN r7332.
2005-09-13 02:00:28 +00:00
Galen Shipman
d932cfd342 merge of rcache work into the trunk.. lotsa fun ;-)..
I regression tested before the merge, I will regression test tonight and
correct issues that might have crept in. 

This commit was SVN r7329.
2005-09-12 22:28:23 +00:00
Brian Barrett
4c62c356c7 * more missing header file recovery
This commit was SVN r7328.
2005-09-12 22:13:09 +00:00
George Bosilca
8308ab42e9 GM depends on proc.h now.
This commit was SVN r7327.
2005-09-12 21:52:44 +00:00
Brian Barrett
88cd561198 * bunch of fixes for Red Storm - missing header files and the like
This commit was SVN r7325.
2005-09-12 21:45:58 +00:00
Tim Woodall
304f6254e6 additional btl flags
This commit was SVN r7324.
2005-09-12 21:38:31 +00:00
Brian Barrett
79f7ea6856 * implement btl_put for Portals
This commit was SVN r7320.
2005-09-12 20:24:43 +00:00
George Bosilca
c9fb1f32f2 And more dependencies fixes. The big commit will follow shortly.
This commit was SVN r7319.
2005-09-12 20:22:59 +00:00
George Bosilca
1b031c153b Last commit to make the threaded case compiling without warnings. Next step try to make it working ...
Correct the spring of the vpid problem (similar to the one in the SM PTL).

Add one more argument to the MCA_BTL_SM_FIFO_WRITE macro, which is passed down to the
MCA_BTL_SM_SIGNAL_PEER macro so that it has access to the fifo_fd file descriptor.

This commit was SVN r7305.
2005-09-11 20:55:22 +00:00
George Bosilca
f8d9f6121c Typo correction ...
This commit was SVN r7303.
2005-09-11 20:49:27 +00:00
George Bosilca
c24eb702bb Correctly compute the default sizes for the fragments.
This commit was SVN r7299.
2005-09-11 20:02:55 +00:00
Jeff Squyres
4aa75fa739 - Make opal_output_stream_t be a real opal_object_t so that it can use
a constructor, like the rest of the code base
- Convert usage in the tree to use the constructor to zero out an
  instance of opal_output_stream_t
- Still need to re-enable output files

This commit was SVN r7253.
2005-09-09 10:46:54 +00:00
Tim Woodall
59f2462ef0 corrections/clarifications
This commit was SVN r7215.
2005-09-07 13:40:22 +00:00
Tim Woodall
3e002203a0 don't need to adjust size
This commit was SVN r7213.
2005-09-07 13:25:05 +00:00
Brian Barrett
ed56e743b7 * update configure.ac to use the modern version of AC_INIT and
AM_INIT_AUTOMAKE, instead of the deprecated version.
* Work around dumbness in modern AC_INIT that requires the version
  number to be set at autoconf time (instead of at configure time, as
  it was before).  Set the version number, minus the subversion r number,
  at autoconf time.  Override the internal variables to include the r
  number (if needed) at configure time.  Basically, the right thing
  should always happen.  The only place it might not is the version
  reported as part of configure --help will not have an r number.
* Since AM_INIT_AUTOMAKE takes a list of options, no need to specify
  them in all the Makefile.am files.
* Adds support for subdir-objects, meaning that object files are put
  in the directory containing source files, even if the Makefile.am is
  in another directory.  This should start making it feasible to
  reduce the number of Makefile.am files we have in the tree, which
  will greatly reduce the time to run autogen and configure.

This commit was SVN r7211.
2005-09-07 05:54:53 +00:00
Galen Shipman
e5ea1b55ef fix for threaded build
This commit was SVN r7194.
2005-09-06 15:21:31 +00:00
Brian Barrett
6f19022db9 * Update Portals configuration to use --with-portals instead of
--with-btl-portals
* Update Red Storm build config file to match change

This commit was SVN r7185.
2005-09-05 21:02:50 +00:00
George Bosilca
3078be40aa First stable version of the MX BTL (at least we pass NetPIPE). The performance is not amazing
but is not that bad either.

On a 2 procs Intel(R) Xeon(TM) CPU 3.20GHz with MYRICOM Inc. Myrinet 2000 Scalable Cluster Interconnect (rev 04) I get:

  0:       1 bytes  13096 times -->      1.10 Mbps in       6.94 usec
  1:       2 bytes  14408 times -->      2.17 Mbps in       7.02 usec
  2:       3 bytes  14243 times -->      3.24 Mbps in       7.07 usec
  3:       4 bytes   9428 times -->      4.27 Mbps in       7.15 usec
  4:       6 bytes  10493 times -->      6.26 Mbps in       7.32 usec
  5:       8 bytes   6834 times -->      8.18 Mbps in       7.47 usec
  6:      12 bytes   8371 times -->     11.89 Mbps in       7.70 usec
  7:      13 bytes   5411 times -->     12.72 Mbps in       7.80 usec
  8:      16 bytes   5919 times -->     15.35 Mbps in       7.95 usec
  9:      19 bytes   7074 times -->     17.66 Mbps in       8.21 usec
 10:      21 bytes   7696 times -->     19.00 Mbps in       8.43 usec
 11:      24 bytes   7906 times -->     20.87 Mbps in       8.77 usec
 12:      27 bytes   8073 times -->     23.05 Mbps in       8.94 usec
 13:      29 bytes   4972 times -->     24.32 Mbps in       9.10 usec
 14:      32 bytes   5307 times -->     26.29 Mbps in       9.29 usec
 15:      35 bytes   5720 times -->     33.61 Mbps in       7.95 usec
 16:      45 bytes   7191 times -->     39.50 Mbps in       8.69 usec
 17:      48 bytes   7670 times -->     41.33 Mbps in       8.86 usec
 18:      51 bytes   7759 times -->     42.80 Mbps in       9.09 usec
 19:      61 bytes   4313 times -->     47.44 Mbps in       9.81 usec
 20:      64 bytes   5012 times -->     57.61 Mbps in       8.48 usec
 21:      67 bytes   6083 times -->     59.31 Mbps in       8.62 usec
 22:      93 bytes   6234 times -->     68.08 Mbps in      10.42 usec
 23:      96 bytes   6396 times -->     80.65 Mbps in       9.08 usec
 24:      99 bytes   7455 times -->     81.56 Mbps in       9.26 usec
 25:     125 bytes   3926 times -->    112.46 Mbps in       8.48 usec
 26:     128 bytes   5848 times -->    116.87 Mbps in       8.36 usec
 27:     131 bytes   6077 times -->    119.22 Mbps in       8.38 usec
 28:     189 bytes   6192 times -->    163.79 Mbps in       8.80 usec
 29:     192 bytes   7572 times -->    168.01 Mbps in       8.72 usec
 30:     195 bytes   7705 times -->    171.13 Mbps in       8.69 usec
 31:     253 bytes   4011 times -->    210.21 Mbps in       9.18 usec
 32:     256 bytes   5423 times -->    214.55 Mbps in       9.10 usec
 33:     259 bytes   5535 times -->    217.64 Mbps in       9.08 usec
 34:     381 bytes   5613 times -->    290.55 Mbps in      10.00 usec
 35:     384 bytes   6663 times -->    296.11 Mbps in       9.89 usec
 36:     387 bytes   6764 times -->    298.74 Mbps in       9.88 usec
 37:     509 bytes   3451 times -->    353.78 Mbps in      10.98 usec
 38:     512 bytes   4546 times -->    359.36 Mbps in      10.87 usec
 39:     515 bytes   4617 times -->    361.53 Mbps in      10.87 usec
 40:     765 bytes   4645 times -->    461.41 Mbps in      12.65 usec
 41:     768 bytes   5270 times -->    468.59 Mbps in      12.50 usec
 42:     771 bytes   5341 times -->    470.16 Mbps in      12.51 usec
 43:    1021 bytes   2695 times -->    508.42 Mbps in      15.32 usec
 44:    1024 bytes   3260 times -->    514.44 Mbps in      15.19 usec
 45:    1027 bytes   3298 times -->    515.72 Mbps in      15.19 usec
 46:    1533 bytes   3307 times -->    707.12 Mbps in      16.54 usec
 47:    1536 bytes   4030 times -->    714.93 Mbps in      16.39 usec
 48:    1539 bytes   4071 times -->    714.41 Mbps in      16.44 usec
 49:    2045 bytes   2040 times -->    761.38 Mbps in      20.49 usec
 50:    2048 bytes   2438 times -->    769.78 Mbps in      20.30 usec
 51:    2051 bytes   2465 times -->    769.78 Mbps in      20.33 usec
 52:    3069 bytes   2465 times -->    923.43 Mbps in      25.36 usec
 53:    3072 bytes   2629 times -->    928.48 Mbps in      25.24 usec
 54:    3075 bytes   2642 times -->    929.07 Mbps in      25.25 usec
 55:    4093 bytes   1323 times -->   1012.38 Mbps in      30.85 usec
 56:    4096 bytes   1620 times -->   1016.69 Mbps in      30.74 usec
 57:    4099 bytes   1627 times -->   1015.16 Mbps in      30.81 usec
 58:    6141 bytes   1625 times -->   1171.82 Mbps in      39.98 usec
 59:    6144 bytes   1667 times -->   1173.85 Mbps in      39.93 usec
 60:    6147 bytes   1669 times -->   1174.44 Mbps in      39.93 usec
 61:    8189 bytes    835 times -->   1232.43 Mbps in      50.69 usec
 62:    8192 bytes    986 times -->   1234.87 Mbps in      50.61 usec
 63:    8195 bytes    988 times -->   1234.85 Mbps in      50.63 usec
 64:   12285 bytes    988 times -->   1360.73 Mbps in      68.88 usec
 65:   12288 bytes    967 times -->   1364.20 Mbps in      68.72 usec
 66:   12291 bytes    970 times -->   1364.56 Mbps in      68.72 usec
 67:   16381 bytes    485 times -->   1385.48 Mbps in      90.21 usec
 68:   16384 bytes    554 times -->   1388.76 Mbps in      90.01 usec
 69:   16387 bytes    555 times -->   1388.41 Mbps in      90.05 usec
 70:   24573 bytes    555 times -->   1499.72 Mbps in     125.01 usec
 71:   24576 bytes    533 times -->   1499.36 Mbps in     125.05 usec
 72:   24579 bytes    533 times -->   1500.44 Mbps in     124.98 usec
 73:   32765 bytes    266 times -->   1499.31 Mbps in     166.73 usec
 74:   32768 bytes    299 times -->   1497.10 Mbps in     166.99 usec
 75:   32771 bytes    299 times -->   1495.29 Mbps in     167.21 usec
 76:   49149 bytes    299 times -->   1528.78 Mbps in     245.28 usec
 77:   49152 bytes    271 times -->   1527.97 Mbps in     245.42 usec
 78:   49155 bytes    271 times -->   1529.35 Mbps in     245.22 usec
 79:   65533 bytes    135 times -->   1586.19 Mbps in     315.21 usec
 80:   65536 bytes    158 times -->   1591.11 Mbps in     314.25 usec
 81:   65539 bytes    159 times -->   1586.50 Mbps in     315.17 usec
 82:   98301 bytes    158 times -->   1668.05 Mbps in     449.61 usec
 83:   98304 bytes    148 times -->   1667.40 Mbps in     449.80 usec
 84:   98307 bytes    148 times -->   1667.29 Mbps in     449.84 usec
 85:  131069 bytes     74 times -->   1709.11 Mbps in     585.09 usec
 86:  131072 bytes     85 times -->   1711.09 Mbps in     584.42 usec
 87:  131075 bytes     85 times -->   1710.92 Mbps in     584.49 usec
 88:  196605 bytes     85 times -->   1727.93 Mbps in     868.08 usec
 89:  196608 bytes     76 times -->   1726.28 Mbps in     868.92 usec
 90:  196611 bytes     76 times -->   1727.06 Mbps in     868.54 usec
 91:  262141 bytes     38 times -->   1757.65 Mbps in    1137.87 usec
 92:  262144 bytes     43 times -->   1758.69 Mbps in    1137.21 usec
 93:  262147 bytes     43 times -->   1759.38 Mbps in    1136.78 usec
 94:  393213 bytes     43 times -->   1801.51 Mbps in    1665.25 usec
 95:  393216 bytes     40 times -->   1803.26 Mbps in    1663.65 usec
 96:  393219 bytes     40 times -->   1800.73 Mbps in    1666.00 usec
 97:  524285 bytes     20 times -->   1805.33 Mbps in    2215.65 usec
 98:  524288 bytes     22 times -->   1806.80 Mbps in    2213.86 usec
 99:  524291 bytes     22 times -->   1805.77 Mbps in    2215.14 usec
100:  786429 bytes     22 times -->   1827.24 Mbps in    3283.64 usec
101:  786432 bytes     20 times -->   1827.03 Mbps in    3284.03 usec
102:  786435 bytes     20 times -->   1827.20 Mbps in    3283.73 usec
103: 1048573 bytes     10 times -->   1840.05 Mbps in    4347.71 usec
104: 1048576 bytes     11 times -->   1839.68 Mbps in    4348.58 usec
105: 1048579 bytes     11 times -->   1840.13 Mbps in    4347.54 usec
106: 1572861 bytes     11 times -->   1853.99 Mbps in    6472.50 usec
107: 1572864 bytes     10 times -->   1854.11 Mbps in    6472.10 usec
108: 1572867 bytes     10 times -->   1854.12 Mbps in    6472.10 usec
109: 2097149 bytes      5 times -->   1861.41 Mbps in    8595.61 usec
110: 2097152 bytes      5 times -->   1861.25 Mbps in    8596.40 usec
111: 2097155 bytes      5 times -->   1860.99 Mbps in    8597.59 usec
112: 3145725 bytes      5 times -->   1868.34 Mbps in   12845.59 usec
113: 3145728 bytes      5 times -->   1868.30 Mbps in   12845.90 usec
114: 3145731 bytes      5 times -->   1868.59 Mbps in   12843.89 usec
115: 4194301 bytes      3 times -->   1872.16 Mbps in   17092.51 usec
116: 4194304 bytes      3 times -->   1872.31 Mbps in   17091.19 usec
117: 4194307 bytes      3 times -->   1872.13 Mbps in   17092.82 usec
118: 6291453 bytes      3 times -->   1875.88 Mbps in   25588.00 usec
119: 6291456 bytes      3 times -->   1875.98 Mbps in   25586.68 usec
120: 6291459 bytes      3 times -->   1875.93 Mbps in   25587.36 usec
121: 8388605 bytes      3 times -->   1877.79 Mbps in   34082.69 usec
122: 8388608 bytes      3 times -->   1877.72 Mbps in   34083.84 usec
123: 8388611 bytes      3 times -->   1877.66 Mbps in   34085.00 usec

This commit was SVN r7180.
2005-09-04 22:08:13 +00:00
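The throughput column in the NetPIPE output above can be re-derived from the byte count and the time column. The figures appear consistent with 1 Mbps meaning 2^20 bits per second: row 122, for example, moves 8388608 bytes in 34083.84 usec, which works out to roughly 1877.72. The sketch below shows only that arithmetic; the name `netpipe_mbps` is hypothetical and is not part of NetPIPE or Open MPI, and the 2^20 convention is an assumption inferred from the table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: throughput in Mbps for `bytes` transferred in `usec`
 * microseconds. Assumes 1 Mbps = 2^20 bits per second, a convention inferred
 * from the table above rather than taken from NetPIPE documentation. */
static double netpipe_mbps(size_t bytes, double usec)
{
    assert(usec > 0.0);
    double bits = (double)bytes * 8.0;   /* payload size in bits */
    double seconds = usec * 1e-6;        /* microseconds to seconds */
    return bits / seconds / (1024.0 * 1024.0);
}
```

Calling `netpipe_mbps(1, 6.94)` reproduces the first row to two decimal places, which is a quick way to sanity-check any other row.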
Tim Woodall
b65dc08ab1 counters need to be signed as we check for <0
This commit was SVN r7155.
2005-09-02 18:26:07 +00:00
Tim Woodall
dfe52fceef minor changes to thread locking
This commit was SVN r7154.
2005-09-02 16:27:01 +00:00
Galen Shipman
589b1b8b5a Additional changes to add_proc and tokens
This commit was SVN r7152.
2005-09-02 15:18:36 +00:00
Galen Shipman
a7a4da4502 Scale the SRQ based on the log base 2 of the number of peers.
This assumes that the peers have all been added via add_procs up front.
Bad things will happen if add_procs is called again later on a new set of
procs. To fix this we need to modify the SRQ, which may wreck things. Looking
into this deeper.

This commit was SVN r7142.
2005-09-02 04:06:51 +00:00
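The commit above scales the shared receive queue with the log base 2 of the peer count but does not say how the logarithm is applied. The following is a minimal sketch of one possible rule, assuming depth = base × (floor(log2(peers)) + 1); the names `floor_log2` and `srq_depth_for_peers` and the formula itself are hypothetical and are not taken from the mvapi or openib BTL.

```c
#include <assert.h>
#include <stdint.h>

/* floor(log2(n)) for n >= 1, computed by counting right shifts. */
static uint32_t floor_log2(uint32_t n)
{
    assert(n >= 1);
    uint32_t log = 0;
    while ((n >>= 1) != 0) {
        ++log;
    }
    return log;
}

/* Hypothetical sizing rule (an assumption, not the BTL's actual formula):
 * one base unit of receive buffers, plus one more per doubling of peers. */
static uint32_t srq_depth_for_peers(uint32_t base_depth, uint32_t num_peers)
{
    return base_depth * (floor_log2(num_peers) + 1);
}
```

Under this rule a base depth of 8 gives 8 buffers for 1 peer and 40 for 16 peers, so the queue grows far more slowly than a per-peer allocation would. As the commit notes, any such rule is only valid if the peer count is known when the queue is sized.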
Galen Shipman
c8a23106c0 More fixes for sq tokens,
Additional work on multi-rail support. 

This commit was SVN r7139.
2005-09-02 03:04:28 +00:00
Tim Woodall
636ab23fdb atomic increment/test
This commit was SVN r7130.
2005-09-01 15:09:50 +00:00
Jeff Squyres
3962c53e2e - Add to AM_CPPFLAGS $(OPAL_LTDL_CPPFLAGS) where necessary in order to
add a -I to find the included ltdl.h (vs. a system-installed ltdl.h)
- Clean up cruft in a bunch of Makefile.am's to remove now-unnecessary
  AM_CPPFLAGS settings to get static-components.h for each framework
- Move the component_repository API functions out of opal/mca/base/base.h
  and into opal/mca/base/mca_base_component_repository.h in order to
  decrease unnecessary dependencies (e.g., before this, almost
  everything in the tree depended on ltdl.h, which is unnecessary --
  only a small number of files really need ltdl.h)

This commit was SVN r7127.
2005-09-01 12:16:36 +00:00
Galen Shipman
29f7b4deda Changed send tokens to both send/rdma tokens for both low and high priority
queue pairs. Tested on intel p2p with 16 procs - Passed. 

This commit was SVN r7119.
2005-09-01 02:41:44 +00:00
Galen Shipman
c7e9563377 Added sender side per qp send tokens to limit the number of outstanding
sends. 

This commit was SVN r7112.
2005-08-31 20:28:42 +00:00
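Sender-side send tokens, as described in the commit above, cap the number of outstanding sends per queue pair. A minimal single-threaded sketch of the idea follows. The type `qp_tokens_t` and the three functions are hypothetical names, not the BTL's own, and a threaded build would need atomic operations in place of the plain increments and decrements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-queue-pair token counter. It is signed on purpose:
 * the acquire path decrements first and then checks for a negative value. */
typedef struct {
    int32_t send_tokens;
} qp_tokens_t;

static void qp_tokens_init(qp_tokens_t *qp, int32_t max_outstanding)
{
    assert(max_outstanding >= 0);
    qp->send_tokens = max_outstanding;
}

/* Take one token before posting a send. Returns false when the limit is
 * reached; the caller is then expected to queue the fragment for later. */
static bool qp_tokens_acquire(qp_tokens_t *qp)
{
    if (--qp->send_tokens < 0) {
        ++qp->send_tokens;   /* undo the decrement, no token was available */
        return false;
    }
    return true;
}

/* Return one token when a send completes. */
static void qp_tokens_release(qp_tokens_t *qp)
{
    ++qp->send_tokens;
}
```

With two tokens, two sends can be posted immediately, a third is refused until a completion calls `qp_tokens_release`, and the counter never stays below zero.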
Galen Shipman
09873f299f Fixed a race in connection establishment..
This commit was SVN r7110.
2005-08-31 19:43:22 +00:00
Galen Shipman
00e0ff729d initialize free list to rr_buf_max, report async errors to user.
This commit was SVN r7095.
2005-08-30 16:44:38 +00:00
Tim Woodall
d34e299829 correctly decrement progress_event if tcp is not being
used so that tcp doesn't impact progress loop

This commit was SVN r7078.
2005-08-29 17:29:58 +00:00
Brian Barrett
173e062fbb * Spell LIBS as LIBS not LIBX ;)
This commit was SVN r7069.
2005-08-27 17:38:50 +00:00
Tim Woodall
5ed6f2c474 change flag to a sensible value
This commit was SVN r7056.
2005-08-26 20:21:07 +00:00
Tim Woodall
d57f3e1662 cleanup - handle request/prepare of zero bytes as special case
This commit was SVN r7055.
2005-08-26 20:19:11 +00:00
Galen Shipman
56f722c6c1 Removed all references to the old common/vapi stuff.
This commit was SVN r7029.
2005-08-25 15:04:22 +00:00
Galen Shipman
8c85aaf85a tell the user no mvapi modules are found..
This commit was SVN r7016.
2005-08-24 21:59:55 +00:00
Brian Barrett
1241645166 * don't ignore Portals BTL anymore
This commit was SVN r6971.
2005-08-22 03:52:07 +00:00
Tim Woodall
205af3af0a correct segment address
This commit was SVN r6942.
2005-08-19 20:20:27 +00:00
Galen Shipman
afdfa70f73 Added support for openib RDMA READ.. note that performance is currently an
issue so PUT is default.. We are determining if this is an openib issue or a
btl issue as we have seen performance increases on mvapi. 

This commit was SVN r6928.
2005-08-18 17:08:27 +00:00
Galen Shipman
589d53e828 only poll high priority qp 1 per progress.
This commit was SVN r6918.
2005-08-17 22:52:56 +00:00
Tim Woodall
f274f524ab - added get based protocol (if supported by btl) for pre-registered memory
- removed 8 bytes from the majority of the pml headers 

This commit was SVN r6916.
2005-08-17 18:23:38 +00:00
Galen Shipman
ee6999fa90 typo in threaded build..
This commit was SVN r6898.
2005-08-16 13:22:08 +00:00
Tim Woodall
9a094ee3b4 - return corrected size
- set send inplace for eager send

This commit was SVN r6891.
2005-08-15 21:30:47 +00:00
Galen Shipman
f248db3789 misc fixes, changes to support multiple mvapi btl's
This commit was SVN r6890.
2005-08-15 19:39:56 +00:00
Brian Barrett
e5eba51e9f * update rdma min size to be 64K. Seems to give best performance on RS.
This commit was SVN r6880.
2005-08-15 13:30:35 +00:00
Jeff Squyres
c465eb8567 Rename opal/threads/thread.h -> opal/threads/threads.h to avoid a
naming conflict with Solaris' <thread.h>

This commit was SVN r6879.
2005-08-15 11:02:01 +00:00
Brian Barrett
b9452e5afe * finish switch to sending in place
This commit was SVN r6876.
2005-08-15 02:31:45 +00:00
Galen Shipman
8e1e2eec3d Misc fixes for threaded builds..
This commit was SVN r6874.
2005-08-14 19:03:09 +00:00
Brian Barrett
f68ede1c93 * turn off debugging in the bowels of the Portals reference implementation
This commit was SVN r6864.
2005-08-14 02:05:23 +00:00
Brian Barrett
51531af9df * implement Portals btl get
This commit was SVN r6862.
2005-08-13 20:55:29 +00:00
Brian Barrett
c83cb66bf2 * Portals can send "in place"
This commit was SVN r6860.
2005-08-13 19:43:08 +00:00
Jeff Squyres
cf16a521c8 Ensure to get ompi/include/constants.h
This commit was SVN r6845.
2005-08-12 21:42:07 +00:00
Tim Woodall
faf146ec4c correct address count
This commit was SVN r6837.
2005-08-12 19:01:04 +00:00
Tim Woodall
5558c014b9 default TCP to only be used if self/sm/gm/mvapi.... are not available
This commit was SVN r6832.
2005-08-12 16:56:46 +00:00
Brian Barrett
2c44b2398d * remove the src/ directory for Portals BTL to bring it in line with the
other BTLs

This commit was SVN r6820.
2005-08-12 14:33:45 +00:00
Galen Shipman
b01ebf45c9 Fixed build error related to direct call (bml_direct_call.h). Misc bug fixes
and compiler warning issues. Fixed threaded build issue. 

This commit was SVN r6819.
2005-08-12 14:08:40 +00:00
Galen Shipman
c3c83aa3e1 BML (BTL Management Layer). Allows BTL's to be used outside of the PML. See
bml.h and PML-OB1 for usage. 

This commit was SVN r6815.
2005-08-12 02:41:14 +00:00
Brian Barrett
d9e5e3343d * random code cleanup
This commit was SVN r6797.
2005-08-10 15:26:15 +00:00
Brian Barrett
d4aa9c702c * remove unused files
This commit was SVN r6796.
2005-08-10 15:11:05 +00:00
Tim Woodall
d458daf437 support for priority flag
This commit was SVN r6795.
2005-08-10 14:32:10 +00:00
Galen Shipman
73757b300c Added BTL_VERBOSE and OMPI_MCA_btl_base_debug: if set to 1, DEBUG output; if
set to 2, VERBOSE output.

This commit was SVN r6783.
2005-08-09 17:49:39 +00:00
Tim Woodall
9b21413fe2 corrected newline
This commit was SVN r6781.
2005-08-09 16:24:48 +00:00
Tim Woodall
b4caa9f9f1 added verbose macro
This commit was SVN r6780.
2005-08-09 16:22:55 +00:00
Brian Barrett
88d316d443 * turn RDMA back on and wonder how I ever committed a patch turning it off...
This commit was SVN r6776.
2005-08-09 14:11:56 +00:00
Tim Woodall
078836c5b9 added put support for zero copy operation
This commit was SVN r6775.
2005-08-09 14:10:17 +00:00
Jeff Squyres
ba31fbf132 A better solution than r6672. If the caller passes in a data segment
alignment of 0, then assume there will be no data segment and don't do
the checks to see if it will be beyond the end of the file.

This commit was SVN r6773.

The following SVN revision numbers were found above:
  r6672 --> open-mpi/ompi@8b56769307
2005-08-08 21:38:27 +00:00
Jeff Squyres
1c5382deac - Fix a minor problem in alignment logic in sm common component
- Adjust btl sm to allocate just a few bytes extra to allow the common
  sm component to assume that there will be a data segment (even though
  the sm btl doesn't use the data segment in that portion of code)

This commit was SVN r6772.
2005-08-08 21:29:05 +00:00
Brian Barrett
6e2b07db91 * update Red Storm compat code to match today's changes
This commit was SVN r6771.
2005-08-08 21:22:15 +00:00
Brian Barrett
694bbc158f * Set max tag for BTLs to 255 not 256
* Major rework of Portals to better match Red Storm and hopefully get
  better performance:
  - Always assume there is only one module (since there are no machines
    on the planet with more than one Portals interface)
  - make progress all one function rather than dispatching to other
    functions and dispatch on event type, not comm type
  - remove polling of unneeded events

This commit was SVN r6769.
2005-08-08 20:56:26 +00:00
Galen Shipman
ba82bc11bc bug fixes and configure check for topspin directory structure..
This commit was SVN r6767.
2005-08-08 19:10:36 +00:00
George Bosilca
8b93cb7661 Rename all the functions starting with mca_base_modex to mca_pml_base_modex.
Change all the places where they are used to fit the new name.

Remove the code to check the remote arch from the PML. We will have a GPR mechanism
in ompi_mpi_initialize to do that.

This commit was SVN r6750.
2005-08-05 18:03:30 +00:00
Jeff Squyres
7678050a2f Grumble. Add *more* missing files...
This commit was SVN r6748.
2005-08-05 14:17:14 +00:00
Jeff Squyres
8ea1fec353 Add missing .h file
This commit was SVN r6744.
2005-08-05 10:30:47 +00:00
Brian Barrett
16e531e373 * fix some bad error messages to actually be useful
This commit was SVN r6741.
2005-08-04 19:28:59 +00:00
Brian Barrett
20d61b4599 * If rdma frag doesn't complete successfully on the receiving end, don't
call the cbfunc, since it's NULL.  The sending side will do the
  "right thing"

This commit was SVN r6735.
2005-08-04 15:45:31 +00:00
Brian Barrett
ab73cc0487 * minor diagnostic printf that should have been in last commit (doh!)
This commit was SVN r6734.
2005-08-04 15:43:50 +00:00
Brian Barrett
a80b00ab5e * Don't change size of user frag - it's not needed, and causes the frag
to never be returned to the free list

This commit was SVN r6733.
2005-08-04 15:43:13 +00:00
Brian Barrett
9cfa6bba6a * If a message isn't successfully sent, reduce the pending sends counter, as
the message is no longer pending
* Try to push out new messages whenever we finish a send, whether it
   worked or not.  Means that in the case where the other side has too
  many sends pending, we'll constantly retry one (and only one, once the
  pending number is reached) message until goodness returns
* Make some warnings only happen in verbose case, as they are mainly
  diagnostics

This commit was SVN r6732.
2005-08-04 15:41:11 +00:00
Jeff Squyres
b2cfedf805 Add copyright headers, ompi_config.h, and stdio.h.
This commit was SVN r6729.
2005-08-04 12:06:08 +00:00
Brian Barrett
26adbfe713 checkpoint to move back to RS
* remove dead code
* add some debugging code

This commit was SVN r6725.
2005-08-03 20:21:23 +00:00
Brian Barrett
6c37ad4471 * more components to ignore on RS
* fix comment

This commit was SVN r6724.
2005-08-03 16:08:27 +00:00
Brian Barrett
67f96f7b46 * convert to new param registration code
* Fix RDMA book keeping

This commit was SVN r6723.
2005-08-03 16:02:02 +00:00
Jeff Squyres
11140e9cb8 We must eliminate and stamp out all forms of redundancy, however they
may appear.

(remove *error.h file from Makefile.am -- a cut-n-paste error that has
propagated to a surprising number of directories ;-) )

This commit was SVN r6721.
2005-08-03 14:47:04 +00:00
Brian Barrett
eb2748130b * don't build TCP component if we don't have IP sockets :)
This commit was SVN r6716.
2005-08-02 20:06:34 +00:00
Brian Barrett
24116a3935 * fix up a bunch of threading issues when progress and/or mpi threads
are enabled.  Mostly just ADD32 -> ADD_SIZE_T issues and naming of
  variables in THREAD_{LOCK,UNLOCK}

This commit was SVN r6706.
2005-08-02 17:36:01 +00:00
Brian Barrett
4a748d62ae * add some missing header files for OS X
This commit was SVN r6702.
2005-08-02 14:59:50 +00:00
Tim Woodall
2214f0502d - first cut at tcp btl (working but not optimal)
- reworked btl error logging macros

This commit was SVN r6701.
2005-08-02 13:20:50 +00:00
George Bosilca
f8ccce7503 One step further.
This commit was SVN r6690.
2005-08-01 17:08:59 +00:00
Brian Barrett
41f7bb3a2a * undo removing of .ompi_unignore file (there is actually a .ompi_ignore
in this directory)

This commit was SVN r6688.
2005-07-29 23:56:48 +00:00
Brian Barrett
6528ee4692 * remove some useless printfs
This commit was SVN r6683.
2005-07-29 00:23:28 +00:00
George Bosilca
dc2a3d7917 We don't have a .ompi_ignore so I don't see why we have a .ompi_unignore.
This commit was SVN r6679.
2005-07-29 00:14:18 +00:00
Brian Barrett
cbf04e3d3f * if the frag isn't going to go, reduce the pending frags count, don't
increase it

This commit was SVN r6675.
2005-07-28 22:29:05 +00:00
George Bosilca
c8bc529df4 The second cut of MX ... still not working yet
This commit was SVN r6666.
2005-07-28 19:53:27 +00:00
Brian Barrett
5e75cb2495 * properly set unlink thresholds - START/END combined are 1 event
This commit was SVN r6662.
2005-07-28 19:28:04 +00:00
Brian Barrett
7441dfc4c3 fix some printfs
This commit was SVN r6660.
2005-07-28 19:15:07 +00:00
Brian Barrett
8a56cd567f * make poll time 0 so that our latency isn't way too high
* learn to spell...

This commit was SVN r6659.
2005-07-28 18:48:30 +00:00
Brian Barrett
05720c099f * use catamount header file
* fix some printfs

This commit was SVN r6654.
2005-07-28 17:09:23 +00:00
Brian Barrett
6cf88caeb4 * remove some unneeded printfs in Portals btl
* add some svn:ignores

This commit was SVN r6653.
2005-07-28 17:04:52 +00:00
Brian Barrett
3f09d5f2a4 * make btl open be safe to call multiple times (btl close already was)
* add btl back into ompi_info.  Since it now directly calls the
  open/close, the missing symbol problems Ralph was seeing when ob1 is
  ignored will not occur.

This commit was SVN r6652.
2005-07-28 16:31:29 +00:00
Rainer Keller
42f23932e0 Partially revert r6647: btl_sm_fifo was in the repository,
but not in Makefile.am

This commit was SVN r6651.
2005-07-28 16:25:09 +00:00
Brian Barrett
b0b6ddd078 * add --enable-heterogeneous (default: enabled) to enable heterogeneous
support in OMPI.  Currently only enables/disables the architecture
  sharing modex in ob1 pml.
* Add sds framework to ompi_info
* Figure out table ids to use for Portals BTL at configure time, since
  we should use 30 & 31 on Red Storm, but the reference implementation
  only supports 0-8.
* Some bug fixes in Portals UTCP sds

This commit was SVN r6650.
2005-07-28 16:16:13 +00:00
Rainer Keller
29465f0f28 There is no file btl_sm_fifo.h
This commit was SVN r6647.
2005-07-28 15:47:46 +00:00
Rainer Keller
6b7eb3b2d9 Add btl_sm_endpoint.h
This commit was SVN r6644.
2005-07-28 15:05:16 +00:00
Brian Barrett
052b4d4da4 * only give warning about removing -pedantic and -Wall if we are actually
going to build the component

This commit was SVN r6641.
2005-07-28 06:05:27 +00:00
Jeff Squyres
a7a9196350 There is no such file btl_gm_error.h.
This commit was SVN r6636.
2005-07-28 00:08:36 +00:00
George Bosilca
94fe5e6ac8 MX BTL use the new configure sub-system.
This commit was SVN r6633.
2005-07-27 23:38:31 +00:00
Brian Barrett
6aa464b67e More changes from Red Storm port
- only call sched_yield if it exists
- don't fail out if modex doesn't work in ob1
- bunch of fixes for Portals BTL
- add cnos rml component
- add NULL gpr component (should only be used if replica AND proxy
  fail to load)

This commit was SVN r6629.
2005-07-27 23:07:14 +00:00
George Bosilca
9603996a31 Self BTL is working just fine.
This commit was SVN r6624.
2005-07-27 21:46:44 +00:00
Rainer Keller
b9e092a9db There is no btl_mx_error.h, and it's not included anywhere.
This commit was SVN r6623.
2005-07-27 20:48:19 +00:00
George Bosilca
e1b3758fa5 The first cut for the MX BTL.
This commit was SVN r6621.
2005-07-27 19:46:36 +00:00
Galen Shipman
758e572ddb Use r_key for prepare_dst
This commit was SVN r6617.
2005-07-27 15:19:08 +00:00
Tim Woodall
a3dbe7687c enable btls
This commit was SVN r6616.
2005-07-27 15:01:05 +00:00
Galen Shipman
aa749499f8 Don't need to remove pedantic flags for openib.
This commit was SVN r6615.
2005-07-27 14:59:41 +00:00
Galen Shipman
115ef95c88 remove ompi_ignore files
add configure.m4 to openib btl and mpool. 

This commit was SVN r6613.
2005-07-27 13:42:37 +00:00
Galen Shipman
f3843bee55 Updated openib btl and mpool to use configure.m4
removed ompi_ignore files from openib btl and mpool. 

This commit was SVN r6612.
2005-07-27 03:38:25 +00:00
Brian Barrett
9a83910165 * Change Myrinet/gm btl and mpool to use configure.m4 instead of
configure.stub

This commit was SVN r6608.
2005-07-26 21:56:36 +00:00
Galen Shipman
9437cea964 Added support for shared receive queue, note that this is a run-time option
using OMPI_MCA_btl_mvapi_use_srq=1  and is disabled by default. 

This commit was SVN r6602.
2005-07-25 21:15:41 +00:00
Galen Shipman
e33a8205e8 Bugfix and use INLINE flag on send.
This commit was SVN r6600.
2005-07-25 14:57:33 +00:00
Tim Woodall
990c466a9f tuning
This commit was SVN r6595.
2005-07-22 21:33:16 +00:00
Galen Shipman
4c119def85 Misc fixes..
This commit was SVN r6583.
2005-07-21 20:26:17 +00:00
Brian Barrett
a4497238ad * per conversation with Tim, completely drain the Portals event queues each
time through component_progress()

This commit was SVN r6576.
2005-07-21 16:09:13 +00:00
Brian Barrett
0ee12467d6 * implement RDMA put
* remove the recv fragment code, since it really isn't needed
* handle memory descriptor binding a bit more sanely, and use
  thresholds so that Portals does the unlink for us, when it feels
  like it.

This commit was SVN r6575.
2005-07-21 16:06:46 +00:00
Brian Barrett
2faa0d179e * rename a bunch of constants to properly follow prefix rule
* change poll time from 10ms to 0ms.  bad brian
* change eager / min send limits to make ob1 a bit happier

This commit was SVN r6573.
2005-07-21 13:31:52 +00:00
Brian Barrett
ee3530b4bf * add me to gm unignores
This commit was SVN r6572.
2005-07-21 12:37:18 +00:00
Tim Woodall
3553333ef8 removed debug
This commit was SVN r6570.
2005-07-20 21:13:24 +00:00
Galen Shipman
fd969ac833 More code cleanup.. Also converted post receive requests to macros..
This commit was SVN r6566.
2005-07-20 17:43:31 +00:00
Galen Shipman
946402b980 More openib cleanup.. still not ready for public consumption ;-)
This commit was SVN r6565.
2005-07-20 15:17:18 +00:00
Brian Barrett
db4d993228 more code cleanup:
* change all the opal_output_verbose calls in the critical path to
  OPAL_OUTPUT_VERBOSE so that they are pre-processed out if debugging
  is not enabled
* remove stub code

This commit was SVN r6564.
2005-07-20 15:02:56 +00:00
Brian Barrett
3ac83138c2 No real functionality changes, just a bunch of changes to make variable
names and the like more consistent throughout the code

This commit was SVN r6563.
2005-07-20 14:36:52 +00:00
Brian Barrett
cec83a8aba * add me to ompi_unignore
This commit was SVN r6557.
2005-07-20 03:29:08 +00:00
Brian Barrett
c95eacdff7 * add mode for utcp compat code where modex is not used. Instead, use the
"run-time" api for the reference implementation.
* Make the non-modex utcp and redstorm compat code do the same things in
  the same order

This commit was SVN r6556.
2005-07-20 02:49:48 +00:00
Brian Barrett
f7efce87d8 * need to check with Tim, but appears for a received fragment, everything
associated with the descriptor is ours again once the callback function
  returns.  Make it so - probably can optimize out some of the stuff I
  did when I mistakenly thought the descriptor free() was called on the
  passed descriptor
* Fix some dumb accounting errors with MD usage for unexpected receives

This commit was SVN r6555.
2005-07-20 01:24:43 +00:00
Galen Shipman
2f67ab82bb Working version of openib btl ;-)
Fixed receive descriptor counts that limited mvapi and openib to 2 procs.
Begin porting error messages to use the BTL_ERROR macro. 

This commit was SVN r6554.
2005-07-19 21:04:22 +00:00
Jeff Squyres
f09fb6fff4 Update Makefile.am's to get common sm component for symbol resolution.
This commit was SVN r6551.
2005-07-19 14:51:23 +00:00
Jeff Squyres
74744dd9df Fix a holdover mistake from the directory re-org:
- orte/class/ompi_proc_table.[ch] -> orte/class/orte_proc_table.[ch]
- opal_hash_table_[get|set|remove]_proc -> 
  orte_hash_table_[get|set|remove]_proc

This commit was SVN r6549.
2005-07-19 12:25:19 +00:00
Tim Woodall
efc5869b6b - correct typos
- change default buffering to support intel tests

This commit was SVN r6544.
2005-07-18 20:55:42 +00:00
Jeff Squyres
657d10187e Remove a little more cruft.
This commit was SVN r6534.
2005-07-15 21:51:07 +00:00
Galen Shipman
85cdef7abd correct leave_pinned bug
This commit was SVN r6533.
2005-07-15 21:08:36 +00:00
Tim Woodall
7fa40e84ae fix test against max send tokens
This commit was SVN r6531.
2005-07-15 20:56:29 +00:00
Jeff Squyres
f41e4149fa - Add new mpool base function: lookup by module name. This allows
multiple components to share a single mpool module (e.g., the
  ptl/btl and coll sm components).
- Re-tool the ptl, btl, and coll sm components to first look for the
  target mpool module, and if they don't find it, to create it.
- coll sm component now correctly identifies when it is supposed to
  run or not (i.e., if all the processes in the communicator are on
  the same host).  Now we just need to fill in some algorithms.  :-)

This commit was SVN r6530.
2005-07-15 20:01:35 +00:00
Galen Shipman
5af3cc8045 carryover mvapi mpool changes to openib
This commit was SVN r6525.
2005-07-15 16:05:05 +00:00
Galen Shipman
723a7b56ef Removed allocator from mpool_mvapi, moved is_leave_pinned to mpool_base,
corrected free and realloc in mpool. Added alloc_base to
mca_mpool_base_registration_t to be used as the actual alloc'd base address,
which may be different from the reported base address due to page alignment.

This commit was SVN r6524.
2005-07-15 15:52:13 +00:00
Jeff Squyres
84bc5214e9 Convert sm btl to use new OMPI_PROC_FLAG_LOCAL instead of the modex.
This commit was SVN r6522.
2005-07-15 15:22:41 +00:00
Galen Shipman
b75560796c Fix up error handling in openib. Added a simple debug test for memory
registration.

This commit was SVN r6520.
2005-07-15 15:13:19 +00:00
Brian Barrett
fe21bc111a sends seem to work as well as for sm - still seeing segfaults in various
IBM tests, but see same segfaults / asserts in sm btl.

* Add prepare_src implementation so that we can send multiple fragments of
  large messages
* Add queuing of sends if either there are too many outstanding sends
  (we have to limit this so that we don't have more sends pending than
  we could get acks for) or if we get an ack with a 0 byte mlength,
  which means the remote side dropped the message on us.

Still need to valgrind to make sure I'm not leaking resources

This commit was SVN r6508.
2005-07-15 01:43:47 +00:00
Tim Woodall
70fb6fbe21 maintain mru list of registrations for leave pinned option
This commit was SVN r6505.
2005-07-14 22:27:11 +00:00
Galen Shipman
7e8c9289f3 Commented out unused frag variables... will remove completely soon
This commit was SVN r6501.
2005-07-14 21:52:55 +00:00
Tim Woodall
20917f8db0 implemented priority
This commit was SVN r6499.
2005-07-14 21:19:16 +00:00
Brian Barrett
2719a1c1d6 * rename function to match utcp version
This commit was SVN r6491.
2005-07-14 18:08:15 +00:00
Galen Shipman
a8f6ed7a51 Moved runtime params out of the module and into the component.
This commit was SVN r6484.
2005-07-14 14:31:23 +00:00
Brian Barrett
68b91e85ed * add checks for the hton and ntoh functions, since they don't exist on
Red Storm.  Add stub functions to ompi_config_bottom.h when they are
  around
* Add protection for a bunch of #include <netinet/in.h>s
* Fix up the Portals BTL so that it compiles on Red Storm and has the
  right mojo for initialization on Red Storm
* Add some important comments to ompi_check_package and mvapi configures
* Add support for platforms without getpwuid() (aka, Red Storm). 

This commit was SVN r6478.
2005-07-14 04:11:59 +00:00
Tim Woodall
262cda14cf attempt to move posting of buffers out of critical path
This commit was SVN r6469.
2005-07-13 21:39:41 +00:00
Galen Shipman
dcbda13a72 Various bug fixes..
This commit was SVN r6464.
2005-07-13 21:13:30 +00:00
Brian Barrett
17480a5965 * I've sprung yet another username, for the sandia machines
This commit was SVN r6460.
2005-07-13 16:15:32 +00:00
Brian Barrett
4a12652246 * fix bad typo in last commit :(
This commit was SVN r6455.
2005-07-13 04:16:56 +00:00
Brian Barrett
4d580fa706 * disable TCP ptl and oob components if there is no TCP support (look at
sockaddr_in - seems to be a good indicator)
* disable util/if code if no inet devices (again, no sockaddr_in)
* add enable/disable flag to disable stacktrace pretty-print code
  (defaults to enabled).  Seems there's something funky going on with
  the preprocessor on Red Storm that was causing problems - this was
  the easiest fix
* clean up a bunch of the configure.m4 files to remove bogus comments,
   properly comment them, fix the dumb logic for happy/unhappy
* Create a macro for testing both header and library for a package, 
  since we seem to do this kind of test quite often.  Handles the
  -I and -L search paths properly (including stripping out /usr and
  /usr/local if not needed)
* Converted mvapi components to configure.m4, using the nice new
  ompi_check_package macro (above)

This commit was SVN r6454.
2005-07-13 04:16:03 +00:00
Brian Barrett
586918853c * Turn thread support on by default, but disable both mpi and progress
threads (basically, same as before, but we now link the right thread
  libraries). 
* Add disable-io-romio flag to disable compiling ROMIO
* Migrate the mvapi btl from configure.stub to configure.m4

This commit was SVN r6453.
2005-07-13 01:07:31 +00:00
Galen Shipman
c1c4a5efba Compiling checkin of openib btl and mpool..
This commit was SVN r6452.
2005-07-13 00:17:08 +00:00
Galen Shipman
d7bdc46ac9 compile error and warning fixes for openib..
This commit was SVN r6449.
2005-07-12 21:49:30 +00:00
Galen Shipman
ed1da1a7c8 verbs.h not vapi.h --- doh
This commit was SVN r6444.
2005-07-12 19:07:32 +00:00
Galen Shipman
ac527bdc78 Modified btl names as appropriate..
This commit was SVN r6443.
2005-07-12 19:02:39 +00:00
Galen Shipman
2a2bb0c1e5 Modified the openib configure.stub files to search for openib libs in the
appropriate spot and also added a check for the libsysfs library required by openmpi. Modified the mvapi configure.stub to use AC_TRY_LINK for
pthreads. 

This commit was SVN r6441.
2005-07-12 18:06:54 +00:00
Josh Hursey
048d5c1415 Added some userlevel error checking, and messaging.
This commit was SVN r6440.
2005-07-12 18:06:31 +00:00
Galen Shipman
4286dae71b Added the appropriate ignore and unignore files...
This commit was SVN r6436.
2005-07-12 13:39:59 +00:00
Galen Shipman
454fdff824 Initial commit of changes to the mvapi btl to the openib btl. Still need to
work on the configure.stub to correctly locate the ib libraries. 

This commit was SVN r6435.
2005-07-12 13:38:54 +00:00
Brian Barrett
aca3abac5d * checkpoint for the evening
This commit was SVN r6423.
2005-07-11 21:00:08 +00:00
Brian Barrett
6e4f33e48c * after careful consideration, there's really no reason to force config.m4
components to succeed with --enable-dist.  Instead, just add them to
  all_components and make dist will still work - we're going to stamp out
  the Makefiles no matter what
* Add missing header to ob1 pml for make dist
* Clean up the Portals BTL configure code

This commit was SVN r6413.
2005-07-10 01:09:31 +00:00
Brian Barrett
a991d883c1 * Rewrite ompi_mca.m4 to use m4_defined lists of projects (ompi, orte, etc.),
frameworks, and components without configure scripts instead of
  hard-coded shell variables (for projects and frameworks) and 
  shell variable building (for components).
* Add 3rd category of component configuration (in addition to configure
  scripts and no-configured components): configure.m4 components.  These
  components can only be built as part of OMPI (like no-configure), but
  can provide an m4 file that is run as part of the main configure
  script.  These macros can set whether the component should be built, 
  along with just about any other configuration wanted.  More care must
  be taken compared to configure components, as doing things like setting
  variables or calling AC_MSG_ERROR now affects the top-level configure
  script (so calling AC_MSG_ERROR if your component can't configure
  probably isn't what you want)
* Added support to autogen.sh for the configure.m4-style components,
  as well as building up the m4_define lists ompi_mca.m4 now expects
* Updated a number of macros to be more config.cache friendly (both
  so that config.cache can be used and so the test can be quickly
  run multiple times in the same configure script):
    - ompi_config_asm
    - c_weak_symbols
    - c_get_alignment
* Added new macros to be shared when configuring components:
    - ompi_objc.m4 (this actually provides AC_PROG_OBJC - don't ask...)
    - ompi_check_xgrid
    - ompi_check_tm
    - ompi_check_bproc
* Updated a number of components to use configure.m4 instead of
  configure.stub
    - btl portals
    - io romio
    - tm ras and pls
    - bjs, lsf_bproc ras and bproc_seed pls
    - xgrid ras and pls
    - null iof (used by tm) 

This commit was SVN r6412.
2005-07-09 18:52:53 +00:00
Brian Barrett
0ae16f2ab7 * add local hook to remove static-components.h in distclean target. The
files are generated by configure, and not part of the tarball, so
  distclean would be the right place to remove them.

This commit was SVN r6390.
2005-07-08 13:54:12 +00:00
Tim Woodall
a231d53666 corrections to frag size
This commit was SVN r6371.
2005-07-07 21:38:37 +00:00
Tim Woodall
eabdb860bc tuning
This commit was SVN r6370.
2005-07-07 20:58:57 +00:00
Tim Woodall
e0c8991a6e checkpoint
This commit was SVN r6367.
2005-07-07 16:56:58 +00:00
Brian Barrett
465e206216 * fill in some more of the chunk management code (chunks == array of memory
used for the rolling recv buffers for non-RDMA messages)

This commit was SVN r6355.
2005-07-05 22:15:35 +00:00
Brian Barrett
d4bd7252a0 * checkpoint - added a bunch of infrastructure for sends
This commit was SVN r6353.
2005-07-05 21:14:29 +00:00
Brian Barrett
d9fa62d1f2 * checkpoint while it compiles
This commit was SVN r6349.
2005-07-05 16:29:57 +00:00
Brian Barrett
acce172a87 * after checking with Tim, add MCA_BTL_TAG_MAX constant - avoid having to
hard code 256 into all the module structs.
* Update template btl to match change

This commit was SVN r6348.
2005-07-05 14:03:49 +00:00
Jeff Squyres
ba99409628 Major simplifications to component versioning:
- After long discussions and ruminations on how we run components in
  LAM/MPI, made the decision that, by default, all components included
  in Open MPI will use the version number of their parent project
  (i.e., OMPI or ORTE).  They are certainly free to use a different
  number, but this simplification makes the common cases easy:
  - components are only released when the parent project is released
  - it is easy (trivial?) to distinguish which version component goes
    with which version of the parent project
- removed all autogen/configure code for templating the version .h
  file in components
- made all ORTE components use ORTE_*_VERSION for version numbers
- made all OMPI components use OMPI_*_VERSION for version numbers
- removed all VERSION files from components
- configure now displays OPAL, ORTE, and OMPI version numbers
- ditto for ompi_info
- right now, faking it -- OPAL and ORTE and OMPI will always have the
  same version number (i.e., they all come from the same top-level
  VERSION file).  But this paves the way for the Great Configure
  Reorganization, where, among other things, each project will have
  its own version number.

So all in all, we went from a boatload of version numbers to
[effectively] three.  That's pretty good.  :-)

This commit was SVN r6344.
2005-07-04 20:12:36 +00:00
Jeff Squyres
6a9c9953bc Remove a bunch of -I's that are no longer necessary with
properly-prefixed static-component.h files.

This commit was SVN r6342.
2005-07-04 18:24:58 +00:00
Brian Barrett
ed81e51c3a * rename ompi_printf to opal_printf
* rename ompi pty code to opal pty code
* rename ompi_qsort to opal_qsort

This commit was SVN r6335.
2005-07-04 02:16:57 +00:00
Brian Barrett
e55f99d23a * rename ompi_if to opal_if
* rename ompi_malloc to opal_malloc
* rename ompi_numtostr to opal_numtostr
* start of rename of ompi_environ to opal_environ

This commit was SVN r6332.
2005-07-04 01:36:20 +00:00
Brian Barrett
9f44b80291 * rename ompi_argv to opal_argv
* rename ompi_basename to opal_basename
* rename ompi bitop functions to opal
* rename ompi_cmd_line to opal_cmd_line
* rename ompi_sizet2int to opal_sizet2int
* rename orte_daemon_init to opal_daemon_init
* rename ompi_few to opal_few

This commit was SVN r6330.
2005-07-04 00:13:44 +00:00
Brian Barrett
a13166b500 * rename ompi_output to opal_output
This commit was SVN r6329.
2005-07-03 23:31:27 +00:00
Brian Barrett
23b687b0f4 * rename ompi_event to opal_event
This commit was SVN r6328.
2005-07-03 23:09:55 +00:00
Brian Barrett
39dbeeedfb * rename locking code from ompi to opal
This commit was SVN r6327.
2005-07-03 22:45:48 +00:00
Brian Barrett
9da0b4fe1d * rename all the atomic functions from ompi to opal
This commit was SVN r6325.
2005-07-03 21:38:51 +00:00
Brian Barrett
9f0c969bb4 * rename ompi_hash_table to opal_hash_table
This commit was SVN r6324.
2005-07-03 16:52:32 +00:00
Brian Barrett
761402f95f * rename ompi_list to opal_list
This commit was SVN r6322.
2005-07-03 16:22:16 +00:00
Brian Barrett
499e4de1e7 * rename ompi_object and ompi_class to opal_object and opal_class
This commit was SVN r6321.
2005-07-03 16:06:07 +00:00
Brian Barrett
8cad33db40 * finish modex move
* fix protection in opal_free_list.h
* Fix some makefiles

This commit was SVN r6311.
2005-07-03 00:52:18 +00:00
Jeff Squyres
aa056f7bfd First cut of OMPI Makefile.am's, plus a few more catchup updates in orte
This commit was SVN r6286.
2005-07-02 15:06:47 +00:00
Jeff Squyres
4ab17f019b Rename src -> ompi
This commit was SVN r6269.
2005-07-02 13:43:57 +00:00