Brian Barrett
fa4c2af9ed
The Portals 4 reference implementation will sometimes return a NI_FLOWCTL for both a
...
send and an ack. I'm not sure whether this violates the spec, so work around until
we decide...
This commit was SVN r27244.
2012-09-05 19:36:19 +00:00
Josh Hursey
28681deffa
Backout the ORCA commit. :(
...
There is a linking issue on Mac OSX that needs to be addressed before this is able to come back into the trunk.
This commit was SVN r26676.
2012-06-27 01:28:28 +00:00
Josh Hursey
542330e3a7
Commit of ORCA: Open MPI Runtime Collaborative Abstraction
...
This is a runtime interposition project that sits between the OMPI and ORTE layers in Open MPI.
The project is described on the wiki:
https://svn.open-mpi.org/trac/ompi/wiki/Runtime_Interposition
And on this email thread:
http://www.open-mpi.org/community/lists/devel/2012/06/11109.php
This commit was SVN r26670.
2012-06-26 21:42:16 +00:00
Brian Barrett
defaefd59e
Clean up resources from flowcontrol on shutdown
...
This commit was SVN r26605.
2012-06-14 22:38:35 +00:00
Brian Barrett
946ec4cd97
* Update usage of PtlHandleIsEqual to match new semantics
...
* Properly set message to MPI_MESSAGE_NULL in the right places
* Fix double free of buffer for non-contiguous blocking sends
* Remove useless debugging output
This commit was SVN r26604.
2012-06-14 22:24:23 +00:00
Brian Barrett
31279eb641
Fix segfault with long expected messages when using the rndv protocol. We were
...
freeing the ME before the get to grab the long part of the message.
This commit was SVN r26589.
2012-06-11 16:37:01 +00:00
Brian Barrett
25693363e9
* Fix internal accounting error regarding number of available credits
...
* Use a single MD covering all of address space for put transfers, rather
than a per-send MD.
This commit was SVN r26458.
2012-05-20 23:42:26 +00:00
Brian Barrett
2e52374847
* Split send and receive eq sizes
...
* Need to look at slot count before flowcontrol for sending to prevent
race in restart
* Need to free pending request fragments when done with the request
* A number of branch prediction optimizations for error conditions
This commit was SVN r26430.
2012-05-10 21:43:48 +00:00
Brian Barrett
0ae2277796
Add a backoff mechanism for re-establishing communication
...
This commit was SVN r26366.
2012-05-01 15:53:00 +00:00
Brian Barrett
74ade8b181
need to order the pending list before we restart
...
This commit was SVN r26365.
2012-04-30 23:06:00 +00:00
Brian Barrett
5dec52af8d
remove some now unneeded debugging
...
This commit was SVN r26364.
2012-04-30 22:50:52 +00:00
Brian Barrett
c654ee6afc
* Use triggered operations for restart barrier as well
...
This commit was SVN r26363.
2012-04-30 22:48:10 +00:00
Brian Barrett
91a9973bde
* Make flow control on by default
...
* Move alarm code back into a triggered operation
This commit was SVN r26362.
2012-04-30 22:25:40 +00:00
Brian Barrett
e6a0a1cf8a
* Make sure to release all resources on failed send
...
* Avoid triggered ops until we get everything debugged
* Simplify flowctl interface a bit
This commit was SVN r26356.
2012-04-27 21:11:01 +00:00
Brian Barrett
8a70747da2
Fix some naming that doesn't make a ton of sense
...
This commit was SVN r26277.
2012-04-18 01:05:18 +00:00
Brian Barrett
f4d4e87176
add some flow control debugging output
...
This commit was SVN r26276.
2012-04-17 23:14:05 +00:00
Brian Barrett
fe0dfc8e26
First take at flow control protocol
...
This commit was SVN r26274.
2012-04-17 21:46:21 +00:00
Brian Barrett
dde6f094eb
In preparation for flow control changes coming, always utilize ACKs for
...
message completion.
This commit was SVN r26272.
2012-04-16 17:25:27 +00:00
Brian Barrett
451af0e832
Ensure async progress for long unexpected messages by waiting for an
...
event on the ME. The events we're likely to see are LINK (the ME was
added to the match list), PUT (weird to see first, but means that the ME
was linked to the match list and then matched), or PUT_OVERFLOW, meaning
the message was unexpected.
This commit was SVN r26199.
2012-03-26 22:54:35 +00:00
Brian Barrett
2a26d0f9a2
Forgot to add new file in the last commit.
...
Mark ME as invalid once we see a completion event, and look for events before
trying to unlink.
This commit was SVN r26198.
2012-03-26 22:39:05 +00:00
Brian Barrett
0e91084385
* Add type field to the request structure to deal with random user requests
...
(i.e., cancel)
* Implement cancel for receives. Sends are slightly more complicated...
This commit was SVN r26197.
2012-03-26 22:32:36 +00:00
Brian Barrett
cdaf110c0f
* Implement mtl_send in addition to mtl_sendi
...
This commit was SVN r26193.
2012-03-26 19:19:11 +00:00
Brian Barrett
27c8f71773
Start of the flow control implementation. #defined out for now.
...
This commit was SVN r26192.
2012-03-26 01:31:58 +00:00
Brian Barrett
cce936b94c
* Implement matched probe for the CM PML. Required adding a peer field to
...
the ompi_message_t structure to properly initialize convertor (the peer
is available in the request in OB1, and wasn't needed when I did the
original implementation).
* Implement matched probe for the Portals4 MTL and add NULL function pointers
for the other MTLs.
* Add add_comm and del_comm functions to portals4 MTL so that direct call
almost works again.
* Add NEWS item that we've implemented matched probe
This commit was SVN r26180.
2012-03-22 22:55:59 +00:00
Brian Barrett
4d12616b64
Frank pointed out that PTL_OK is zero and PtlHandleIsEqual either returns
...
PTL_OK or PTL_FAIL and that I had these backwards.
This commit was SVN r26179.
2012-03-22 15:58:00 +00:00
Brian Barrett
1c6b5a1358
* Set all appropriate flags for portal table entries
...
* split eq into send and receive eqs so that we can control the number
of outstanding events in send eq and ensure we never lose an ack
* Shouldn't ever truncate on short unexpected receive blocks, so don't set
the truncate bit
* Track active vs. waiting for free short unexpected receive blocks to
ensure an active short unexpected receive block is posted coming out
of flow control. Also allow creation of "temporary" blocks which should
be released once the FREE event is received.
* Slight reorganization of some code in preparation for more flow control
work.
This commit was SVN r26174.
2012-03-21 22:20:55 +00:00
Brian Barrett
45a27e4f9f
For now, ignore LINK event
...
This commit was SVN r25467.
2011-11-11 02:49:03 +00:00
Brian Barrett
d8b5b544ad
Update list name to match change in spec
...
This commit was SVN r25273.
2011-10-12 20:09:39 +00:00
Brian Barrett
fc29ffebdb
* remove two aborts that aren't necessary
...
This commit was SVN r25214.
2011-09-29 22:27:23 +00:00
Brian Barrett
14f32a1a54
* Clean up progress function
...
* Only print returnable errors when verbose=1. Still print errors when
we're going to abort, since those obviously aren't returnable
This commit was SVN r25213.
2011-09-29 22:26:33 +00:00
Brian Barrett
758f8a4d87
* More debugging output
...
* Make recv short block events use the callback mechanism so that we can
add overflow debugging
This commit was SVN r25212.
2011-09-29 21:59:48 +00:00
Brian Barrett
c08ea5c0f5
Set options correctly for the two pts
...
This commit was SVN r25211.
2011-09-29 21:56:37 +00:00
Brian Barrett
05f800abae
Properly unpack data for long unexpected
...
This commit was SVN r25210.
2011-09-29 17:25:45 +00:00
Brian Barrett
bb9e73232a
* Leverage hdr_data and opcount to improve debugging
...
* Clean up handling of short synchronous messages
This commit was SVN r25208.
2011-09-28 21:18:47 +00:00
Brian Barrett
71d8300607
* Fix name clash with macros in mtl_portals4.h
...
* hdr_data now includes opcount and length for all messages, which is the match
bits for long and rndv messages
* Re-add probe implementation
This commit was SVN r25207.
2011-09-28 16:53:01 +00:00
Brian Barrett
2fb8045fad
clean up printfs
...
This commit was SVN r25206.
2011-09-28 15:28:46 +00:00
Brian Barrett
26e781f002
* Remove triggered code for now
...
* Move from per-endpoint send/recv count to just send side op count
This commit was SVN r25205.
2011-09-28 15:25:39 +00:00
Brian Barrett
592c1ab6db
* revert probe and size information changes, since it seems to break everything
...
This commit was SVN r25204.
2011-09-28 14:57:19 +00:00
Brian Barrett
211b5c7824
* Make triggered protocol only work for non-wildcard receives
...
* Always encode length in header data to make probe work
* General send/receive cleanups
* Implement iprobe
This commit was SVN r25197.
2011-09-27 22:45:00 +00:00
Brian Barrett
77c560be42
updates to match new api changes
...
This commit was SVN r25196.
2011-09-27 20:38:22 +00:00
Mike Dubman
fd17f20ed5
Currently MTLs do not handle communicator contexts in any special way,
...
they only add the context id to the tag selection of the underlying
messaging mechanism.
We would like to enable an MTL to maintain its own context data
per-communicator. This way an MTL will be able to queue incoming eager
messages and rendezvous requests on a per-communicator basis.
The MTL will be allowed to override the comm->c_pml_comm member,
since it's unused in pml_cm anyway.
This commit was SVN r24858.
2011-07-06 18:25:49 +00:00
Brian Barrett
e8817f3f63
* Don't send acks for expected triggered messages; still need to get the rest of the data
...
* Don't ask for UNLINK events for persistent long unexpected ME or the get MEs.
This commit was SVN r24814.
2011-06-23 16:21:10 +00:00
Brian Barrett
09d89242d6
Crank up the number of short receive blocks so that we're unlikely to hit the flow
...
control case. Uses about the same amount of memory as the Portals 3.3 implementations
This commit was SVN r24782.
2011-06-16 21:58:53 +00:00
Brian Barrett
4fec0c198d
Update short recv blocks to properly set up for triggered operations (where
...
they also store the triggered start message)
This commit was SVN r24777.
2011-06-16 16:51:59 +00:00
Brian Barrett
83154af74d
Check return codes a bit more closely
...
Fix broken debug output in any_source recv case
Other minor code cleanups
This commit was SVN r24774.
2011-06-13 15:18:55 +00:00
Brian Barrett
a7c682cdb0
Fix starting buffer point for triggered get. Should be after the eager part of the
...
message
This commit was SVN r24752.
2011-06-06 17:08:13 +00:00
Brian Barrett
b778d785fb
Add some debugging output and fix some places where the output id and
...
verbosity level were swapped
This commit was SVN r24740.
2011-06-01 17:20:18 +00:00
Brian Barrett
37d5c7e2ca
* Add ability to set long protocol with MCA parameter
...
* Instead of static arrays of send/recv counts, put them in the endpoint
This commit was SVN r24735.
2011-05-26 21:53:39 +00:00
Brian Barrett
beb1bc70b2
* Add support for using modex to exchange NID/PID pairs when using Portals4.
...
Rather than try to support a bunch of lightweight environments like I did
with the Portals3 code, always use the "modex" and hack the grpcomm for
the SHMEM implementation to return the right nid/pid for a remote
process by "magic".
This commit was SVN r24733.
2011-05-25 22:10:27 +00:00
Brian Barrett
d8b7ea315e
First take at implementing rndv and triggered protocols
...
This commit was SVN r24699.
2011-05-13 05:57:16 +00:00