1
1

56 Коммитов

Автор SHA1 Сообщение Дата
Brian Barrett
fa4c2af9ed THe Portals 4 reference implementation will sometimes return a NI_FLOWCTL for both a
send and an ack.  I'm not sure whether this violates the spec, so work around until
we decide...

This commit was SVN r27244.
2012-09-05 19:36:19 +00:00
Josh Hursey
28681deffa Backout the ORCA commit. :(
There is a linking issue on Mac OSX that needs to be addressed before this is able to come back into the trunk.

This commit was SVN r26676.
2012-06-27 01:28:28 +00:00
Josh Hursey
542330e3a7 Commit of ORCA: Open MPI Runtime Collaborative Abstraction
This is a runtime interposition project that sits between the OMPI and ORTE layers in Open MPI.

The project is described on the wiki:
  https://svn.open-mpi.org/trac/ompi/wiki/Runtime_Interposition

And on this email thread:
  http://www.open-mpi.org/community/lists/devel/2012/06/11109.php

This commit was SVN r26670.
2012-06-26 21:42:16 +00:00
Brian Barrett
defaefd59e Clean up resources from flowcontrol on shutdown
This commit was SVN r26605.
2012-06-14 22:38:35 +00:00
Brian Barrett
946ec4cd97 * Update usage of PtlHandleIsEqual to match new semantic
* Properly set message to MPI_MESSAGE_NULL in the right places
* Fix double free of buffer for non-contiguous blocking sends
* Remove useless debugging output

This commit was SVN r26604.
2012-06-14 22:24:23 +00:00
Brian Barrett
31279eb641 Fix segfault with long expected messages when using the rndv protocol. We were
freeing the ME before the get to grab the long part of the message.

This commit was SVN r26589.
2012-06-11 16:37:01 +00:00
Brian Barrett
25693363e9 * Fix internal accounting error regarding number of available credits
* Use a single MD covering all of address space for put transfers, rather
 than a per-send MD.

This commit was SVN r26458.
2012-05-20 23:42:26 +00:00
Brian Barrett
2e52374847 * Split send and receive eq sizes
* Need to look at slot count before flowcontrol for sending to prevent
  race in restart
* Need to free pending request fragments when done with the request
* A number of branch prediction optimizations for error conditions

This commit was SVN r26430.
2012-05-10 21:43:48 +00:00
Brian Barrett
0ae2277796 Add a backoff mechanism for re-establishing communication
This commit was SVN r26366.
2012-05-01 15:53:00 +00:00
Brian Barrett
74ade8b181 need to order the pending list before we restart
This commit was SVN r26365.
2012-04-30 23:06:00 +00:00
Brian Barrett
5dec52af8d remove some now unneeded debugging
This commit was SVN r26364.
2012-04-30 22:50:52 +00:00
Brian Barrett
c654ee6afc * Use triggered operations for restart barrier as well
This commit was SVN r26363.
2012-04-30 22:48:10 +00:00
Brian Barrett
91a9973bde * Make flow control on by default
* Move alarm code back into a triggered operation

This commit was SVN r26362.
2012-04-30 22:25:40 +00:00
Brian Barrett
e6a0a1cf8a * Make sure to release all resources on failed send
* Avoid triggered ops until we get everything debugged
* Simplify flowctl interface a bit

This commit was SVN r26356.
2012-04-27 21:11:01 +00:00
Brian Barrett
8a70747da2 Fix some naming that doesn't make a ton of sense
This commit was SVN r26277.
2012-04-18 01:05:18 +00:00
Brian Barrett
f4d4e87176 add some flow control debugging output
This commit was SVN r26276.
2012-04-17 23:14:05 +00:00
Brian Barrett
fe0dfc8e26 First take at flow control protocol
This commit was SVN r26274.
2012-04-17 21:46:21 +00:00
Brian Barrett
dde6f094eb In preperation for flow control changes coming, always utilize ACKs for
message completion.

This commit was SVN r26272.
2012-04-16 17:25:27 +00:00
Brian Barrett
451af0e832 Ensure async progress for long unexpected messages by waiting for an
event on the ME.  The events we're likely to see are LINK (the ME was
added to the match list), PUT (weird to see first, but means that the ME
was linked to the match list and then matched), or PUT_OVERFLOW, meaning
the message was unexpected.

This commit was SVN r26199.
2012-03-26 22:54:35 +00:00
Brian Barrett
2a26d0f9a2 Forgot to add new file in the last commit.
Mark ME as invalid once we see a completion event, and look for events before
trying to unlink.

This commit was SVN r26198.
2012-03-26 22:39:05 +00:00
Brian Barrett
0e91084385 * Add type field to the request structure to deal with random user requests
(ie, cancel)
* Implement cancel for receives.  Sends are slightly more complicated...

This commit was SVN r26197.
2012-03-26 22:32:36 +00:00
Brian Barrett
cdaf110c0f * Implement mtl_send in addition to mtl_sendi
This commit was SVN r26193.
2012-03-26 19:19:11 +00:00
Brian Barrett
27c8f71773 Start of the flow control implementation. #defined out for now.
This commit was SVN r26192.
2012-03-26 01:31:58 +00:00
Brian Barrett
cce936b94c * Implement matched probe for the CM PML. Required adding a peer field to
the ompi_message_t structure to properly initialize convertor (the peer
  is available in the request in OB1, and wasn't needed when I did the
  original implementation).
* Implement matched probe for the Portals4 MTL and add NULL function pointers
  for the other MTLs.
* Add add_comm and del_comm functions to portals4 MTL so that direct call
  almost works again.
* Add NEWS item that we've implemented matched probe

This commit was SVN r26180.
2012-03-22 22:55:59 +00:00
Brian Barrett
4d12616b64 Frank pointed out that PTL_OK is zero and PtlHandleIsEqual either returns
PTL_OK or PTL_FAIL and that I had these backwards.

This commit was SVN r26179.
2012-03-22 15:58:00 +00:00
Brian Barrett
1c6b5a1358 * Set all appropriate flags for portal table entries
* split eq into send and receive eqs so that we can control the number
  of outstanding events in send eq and ensure we never lose an ack
* Shouldn't ever truncate on short unexpected receive bocks, so don't set
  the truncate bit
* Track active vs. waiting for free short unexpected receive blocks so
  to ensure an active short unexpected receive block is posted coming out
  of flow control.  Also allow creation of "temporary" blocks which should
  be released once FREE event is received.
* Slight reorganization of some code in preparation for more flow control
  work.

This commit was SVN r26174.
2012-03-21 22:20:55 +00:00
Brian Barrett
45a27e4f9f For now, ignore LINK event
This commit was SVN r25467.
2011-11-11 02:49:03 +00:00
Brian Barrett
d8b5b544ad Update list name to match change in spec
This commit was SVN r25273.
2011-10-12 20:09:39 +00:00
Brian Barrett
fc29ffebdb * remove two aborts that aren't necessary
This commit was SVN r25214.
2011-09-29 22:27:23 +00:00
Brian Barrett
14f32a1a54 * Clean up progress function
* Only print returnable errors when verbose=1.  Still print errors when
  we're going to abort, since those obviously aren't returnable

This commit was SVN r25213.
2011-09-29 22:26:33 +00:00
Brian Barrett
758f8a4d87 * More debugging output
* Make recv short block events use the callback mechanism so that can
  add overflow debugging

This commit was SVN r25212.
2011-09-29 21:59:48 +00:00
Brian Barrett
c08ea5c0f5 Set options correctly for the two pts
This commit was SVN r25211.
2011-09-29 21:56:37 +00:00
Brian Barrett
05f800abae Properly unpack data for long unexpected
This commit was SVN r25210.
2011-09-29 17:25:45 +00:00
Brian Barrett
bb9e73232a * Leverage hdr_data and opcount to improve debugging
* Clean up handling of short synchronous messages

This commit was SVN r25208.
2011-09-28 21:18:47 +00:00
Brian Barrett
71d8300607 * Fix name clash with macros in mtl_portals4.h
* hdr_data now includes opcount and length for all messages, which is the match
  bits for long and rndv messages
* Re-add probe implementation 

This commit was SVN r25207.
2011-09-28 16:53:01 +00:00
Brian Barrett
2fb8045fad clean up printfs
This commit was SVN r25206.
2011-09-28 15:28:46 +00:00
Brian Barrett
26e781f002 * Remove triggered code for now
* Move from per-endpoint send/recv count to just send side op count

This commit was SVN r25205.
2011-09-28 15:25:39 +00:00
Brian Barrett
592c1ab6db * revert probe and size information changes, since it seems to break everything
This commit was SVN r25204.
2011-09-28 14:57:19 +00:00
Brian Barrett
211b5c7824 * Make triggered protocol only work for non-wildcard receives
* Always encode length in header data to make probe work
* General send/receive cleanups
* Implement iprobe

This commit was SVN r25197.
2011-09-27 22:45:00 +00:00
Brian Barrett
77c560be42 updates to match new api changes
This commit was SVN r25196.
2011-09-27 20:38:22 +00:00
Mike Dubman
fd17f20ed5 Currently MTLs do no handle communicator contexts in any special way,
they only add the context id to the tag selection of the underlying 
messaging meachinsm. 
 
 We would like to enable an MTL to maintain its own context data
per-communicator. This way an MTL will be able to queue incoming eager 
messages and rendezvous requests per-communicator basis.

 The MTL will be allowed to override comm->c_pml_comm member, 
since it's unused in pml_cm anyway. 

This commit was SVN r24858.
2011-07-06 18:25:49 +00:00
Brian Barrett
e8817f3f63 * Don't send acks for expected triggered messages; still need to get the rest of the data
* Don't ask for UNLINK events for persistent long unexpected ME or the get MEs.

This commit was SVN r24814.
2011-06-23 16:21:10 +00:00
Brian Barrett
09d89242d6 Crank up the number of short receive blocks so that we're unlikely to hit the flow
control case.  Uses about same amount of memory as the Portals 3.3 implementations

This commit was SVN r24782.
2011-06-16 21:58:53 +00:00
Brian Barrett
4fec0c198d updtae short recv blocks to properly setup for triggered operations (where
they also store the triggered start message)

This commit was SVN r24777.
2011-06-16 16:51:59 +00:00
Brian Barrett
83154af74d Check return codes a bit more closely
Fix broken debug output in any_source recv case
Other minor code cleanups

This commit was SVN r24774.
2011-06-13 15:18:55 +00:00
Brian Barrett
a7c682cdb0 Fix starting buffer point for triggered get. Should be after the eager part of the
message

This commit was SVN r24752.
2011-06-06 17:08:13 +00:00
Brian Barrett
b778d785fb Add some debugging output and fix some places where the output id and
verbosity level were swapped

This commit was SVN r24740.
2011-06-01 17:20:18 +00:00
Brian Barrett
37d5c7e2ca * Add ability to set long protocol with MCA parameter
* Instead of static arrays of send/recv counts, put them in the endpoint

This commit was SVN r24735.
2011-05-26 21:53:39 +00:00
Brian Barrett
beb1bc70b2 * Add support for using modex to exchange NID/PID pairs when using Portals4.
Rather than try to support a bunch of lightweight environments like I did
  with the Portals3 code, always use the "modex" and hack the grpcomm for
  the SHMEM implementation to return the right nid/pid for a remote
  process by "magic".

This commit was SVN r24733.
2011-05-25 22:10:27 +00:00
Brian Barrett
d8b7ea315e First take at implementing rndv and triggered protocols
This commit was SVN r24699.
2011-05-13 05:57:16 +00:00