openmpi

Author	SHA1	Message	Date
Vasily Filipov	597a422272	MTL: make MXM work with read (in blocking send case) call-backs. This commit was SVN r26807.	2012-07-19 13:28:06 +00:00
George Bosilca	4326537fe9	Remove compiler warning about uninitialized variable. This commit was SVN r26760.	2012-07-08 00:07:52 +00:00
Yevgeny Kliteynik	0e28fa984b	Remove dead code that was related to ticket #2971 This commit was SVN r26701.	2012-07-02 11:19:09 +00:00
Jeff Squyres	5d030278e1	Refs trac:3130: Per comment 8 on the ticket, this MX patch fixes the cases where the MX BTL and MTL are stepping on each other regarding the mpool. Thanks to Yong Qin for assistance in tracking this down. This commit was SVN r26698. The following Trac tickets were found above: Ticket 3130 --> https://svn.open-mpi.org/trac/ompi/ticket/3130	2012-06-29 13:52:40 +00:00
Ralph Castain	0dfe29b1a6	Roll in the rest of the modex change. Eliminate all non-modex API access of RTE info from the MPI layer - in some cases, the info was already present (either in the ompi_proc_t or in the orte_process_info struct) and no call was necessary. This removes all calls to orte_ess from the MPI layer. Calls to orte_grpcomm remain required. Update all the orte ess components to remove their associated APIs for retrieving proc data. Update the grpcomm API to reflect transfer of set/get modex info to the db framework. Note that this doesn't recreate the old GPR. This is strictly a local db storage that may (at some point) obtain any missing data from the local daemon as part of an async methodology. The framework allows us to experiment with such methods without perturbing the default one. This commit was SVN r26678.	2012-06-27 14:53:55 +00:00
Josh Hursey	28681deffa	Backout the ORCA commit. :( There is a linking issue on Mac OSX that needs to be addressed before this is able to come back into the trunk. This commit was SVN r26676.	2012-06-27 01:28:28 +00:00
Josh Hursey	542330e3a7	Commit of ORCA: Open MPI Runtime Collaborative Abstraction This is a runtime interposition project that sits between the OMPI and ORTE layers in Open MPI. The project is described on the wiki: https://svn.open-mpi.org/trac/ompi/wiki/Runtime_Interposition And on this email thread: http://www.open-mpi.org/community/lists/devel/2012/06/11109.php This commit was SVN r26670.	2012-06-26 21:42:16 +00:00
Brian Barrett	defaefd59e	Clean up resources from flowcontrol on shutdown This commit was SVN r26605.	2012-06-14 22:38:35 +00:00
Brian Barrett	946ec4cd97	* Update usage of PtlHandleIsEqual to match new semantic * Properly set message to MPI_MESSAGE_NULL in the right places * Fix double free of buffer for non-contiguous blocking sends * Remove useless debugging output This commit was SVN r26604.	2012-06-14 22:24:23 +00:00
Brian Barrett	31279eb641	Fix segfault with long expected messages when using the rndv protocol. We were freeing the ME before the get to grab the long part of the message. This commit was SVN r26589.	2012-06-11 16:37:01 +00:00
Mike Dubman	10831e111a	detect num of local procs This commit was SVN r26555.	2012-06-05 09:13:16 +00:00
Yevgeny Kliteynik	1cbce83ece	Fixed wording of MXM parameters as suggested by Jeff. This commit was SVN r26545.	2012-06-03 21:48:42 +00:00
Yevgeny Kliteynik	f02bf707a4	Added MXM parameter "np" that controls the minimal number of processes that allow MXM to run Default: 128 MXM advantages kick in with large number of processes. This commit was SVN r26544.	2012-06-02 11:07:20 +00:00
Brian Barrett	2effbb1ba6	fix copy/paste typo This commit was SVN r26492.	2012-05-24 16:06:20 +00:00
Ralph Castain	c0304eb23a	Fix copy/paste typo This commit was SVN r26491.	2012-05-24 15:47:20 +00:00
Brian Barrett	25693363e9	* Fix internal accounting error regarding number of available credits * Use a single MD covering all of address space for put transfers, rather than a per-send MD. This commit was SVN r26458.	2012-05-20 23:42:26 +00:00
Brian Barrett	2e52374847	* Split send and receive eq sizes * Need to look at slot count before flowcontrol for sending to prevent race in restart * Need to free pending request fragments when done with the request * A number of branch prediction optimizations for error conditions This commit was SVN r26430.	2012-05-10 21:43:48 +00:00
Brian Barrett	0ae2277796	Add a backoff mechanism for re-establishing communication This commit was SVN r26366.	2012-05-01 15:53:00 +00:00
Brian Barrett	74ade8b181	need to order the pending list before we restart This commit was SVN r26365.	2012-04-30 23:06:00 +00:00
Brian Barrett	5dec52af8d	remove some now unneeded debugging This commit was SVN r26364.	2012-04-30 22:50:52 +00:00
Brian Barrett	c654ee6afc	* Use triggered operations for restart barrier as well This commit was SVN r26363.	2012-04-30 22:48:10 +00:00
Brian Barrett	91a9973bde	* Make flow control on by default * Move alarm code back into a triggered operation This commit was SVN r26362.	2012-04-30 22:25:40 +00:00
Brian Barrett	e6a0a1cf8a	* Make sure to release all resources on failed send * Avoid triggered ops until we get everything debugged * Simplify flowctl interface a bit This commit was SVN r26356.	2012-04-27 21:11:01 +00:00
Brian Barrett	8a70747da2	Fix some naming that doesn't make a ton of sense This commit was SVN r26277.	2012-04-18 01:05:18 +00:00
Brian Barrett	f4d4e87176	add some flow control debugging output This commit was SVN r26276.	2012-04-17 23:14:05 +00:00
Brian Barrett	fe0dfc8e26	First take at flow control protocol This commit was SVN r26274.	2012-04-17 21:46:21 +00:00
Brian Barrett	dde6f094eb	In preparation for flow control changes coming, always utilize ACKs for message completion. This commit was SVN r26272.	2012-04-16 17:25:27 +00:00
Mike Dubman	34acf769d4	mtl_mxm: support canceling messages This commit was SVN r26256.	2012-04-09 16:02:05 +00:00
Brian Barrett	451af0e832	Ensure async progress for long unexpected messages by waiting for an event on the ME. The events we're likely to see are LINK (the ME was added to the match list), PUT (weird to see first, but means that the ME was linked to the match list and then matched), or PUT_OVERFLOW, meaning the message was unexpected. This commit was SVN r26199.	2012-03-26 22:54:35 +00:00
Brian Barrett	2a26d0f9a2	Forgot to add new file in the last commit. Mark ME as invalid once we see a completion event, and look for events before trying to unlink. This commit was SVN r26198.	2012-03-26 22:39:05 +00:00
Brian Barrett	0e91084385	* Add type field to the request structure to deal with random user requests (ie, cancel) * Implement cancel for receives. Sends are slightly more complicated... This commit was SVN r26197.	2012-03-26 22:32:36 +00:00
Brian Barrett	61a090e0d1	Checking for NULL function pointers and direct-call semantics can't work together, so implement all functions in the MTL interface for all MTLs. The only places NULL was still being set were add_comm/del_comm and matched probe, both of which are straightforward to implement (or return ERROR_NOT_IMPLEMENTED, since the PML can't emulate matched probe). This commit was SVN r26194.	2012-03-26 19:27:03 +00:00
Brian Barrett	cdaf110c0f	* Implement mtl_send in addition to mtl_sendi This commit was SVN r26193.	2012-03-26 19:19:11 +00:00
Brian Barrett	27c8f71773	Start of the flow control implementation. #defined out for now. This commit was SVN r26192.	2012-03-26 01:31:58 +00:00
Brian Barrett	cce936b94c	* Implement matched probe for the CM PML. Required adding a peer field to the ompi_message_t structure to properly initialize convertor (the peer is available in the request in OB1, and wasn't needed when I did the original implementation). * Implement matched probe for the Portals4 MTL and add NULL function pointers for the other MTLs. * Add add_comm and del_comm functions to portals4 MTL so that direct call almost works again. * Add NEWS item that we've implemented matched probe This commit was SVN r26180.	2012-03-22 22:55:59 +00:00
Brian Barrett	4d12616b64	Frank pointed out that PTL_OK is zero and PtlHandleIsEqual either returns PTL_OK or PTL_FAIL and that I had these backwards. This commit was SVN r26179.	2012-03-22 15:58:00 +00:00
Brian Barrett	1c6b5a1358	* Set all appropriate flags for portal table entries * split eq into send and receive eqs so that we can control the number of outstanding events in send eq and ensure we never lose an ack * Shouldn't ever truncate on short unexpected receive blocks, so don't set the truncate bit * Track active vs. waiting for free short unexpected receive blocks so as to ensure an active short unexpected receive block is posted coming out of flow control. Also allow creation of "temporary" blocks which should be released once FREE event is received. * Slight reorganization of some code in preparation for more flow control work. This commit was SVN r26174.	2012-03-21 22:20:55 +00:00
Mike Dubman	bd7abd72a9	in mca_mtl_mxm, don't allow negative tags for MPI_ANY_TAG This commit was SVN r26128.	2012-03-09 22:11:14 +00:00
Mike Dubman	540b3c0c25	update mxm mtl to changes in mxm api This commit was SVN r26073.	2012-02-29 22:02:34 +00:00
Mike Dubman	81bd5eee8d	in mxm, use sender_len field and not actual_len when returning result from probe This commit was SVN r25993.	2012-02-21 19:55:16 +00:00
Mike Dubman	6ec768f0c6	fix #2971 This commit was SVN r25908.	2012-02-12 09:28:42 +00:00
Mike Dubman	b18a1611c3	- if everything is ok set return value to OMPI_SUCCESS in mtl/mxm This commit was SVN r25879.	2012-02-08 14:19:58 +00:00
Mike Dubman	6188ab7317	* ep init refactoring * split ep_info into fragments to fit PMI limit This commit was SVN r25857.	2012-02-02 15:00:47 +00:00
Mike Dubman	92873872f5	revert r25813 This commit was SVN r25816. The following SVN revision numbers were found above: r25813 --> open-mpi/ompi@8ed781d7e9	2012-01-30 13:22:38 +00:00
Mike Dubman	8ed781d7e9	add mca param to enable/disable mxm This commit was SVN r25813.	2012-01-30 11:14:20 +00:00
Mike Dubman	9f0ca9dfc0	fix: extract source from imm request fields instead of from deprecated api This commit was SVN r25812.	2012-01-30 10:37:37 +00:00
Mike Dubman	6c954ad43f	set mxm to call opal_progress in tight loops This commit was SVN r25788.	2012-01-26 18:33:43 +00:00
Mike Dubman	37dc53bbc9	mxm: return the MXM_REQ_SEND_SYNC flag to mxm_req_send This commit was SVN r25694.	2012-01-06 18:56:28 +00:00
Mike Dubman	3b97d609a8	mtl_mxm: fix double free This commit was SVN r25693.	2012-01-06 16:22:58 +00:00
Brian Barrett	45a27e4f9f	For now, ignore LINK event This commit was SVN r25467.	2011-11-11 02:49:03 +00:00
Mike Dubman	00c27afd52	fix pid This commit was SVN r25463.	2011-11-09 17:53:59 +00:00
Mike Dubman	71398b658e	fix: OMPI_ERR_CONNECTION_FAILED available in v1.5, unavailable in trunk This commit was SVN r25459.	2011-11-08 12:34:01 +00:00
Mike Dubman	4cf9e1323d	fix: return correct error on connection failure This commit was SVN r25452.	2011-11-07 06:13:17 +00:00
Mike Dubman	7595a80a63	fix self pid This commit was SVN r25424.	2011-11-03 06:46:20 +00:00
Mike Dubman	3edd77ea25	update mxm plugin to mxm api change: pass synchronous request as an opcode instead of a flag This commit was SVN r25403.	2011-10-31 22:36:15 +00:00
Mike Dubman	6b50ba22a6	select mxm ptl based on user preferences This commit was SVN r25401.	2011-10-31 10:17:43 +00:00
Mike Dubman	f96ae43e23	pass jobid to mxm/sm module This commit was SVN r25375.	2011-10-27 13:14:52 +00:00
Mike Dubman	9ffeeb69d9	fix help message This commit was SVN r25364.	2011-10-25 14:02:43 +00:00
Samuel Gutierrez	663f4546f5	fix define typo in psm mtl. This commit was SVN r25362.	2011-10-24 18:38:12 +00:00
Brian Barrett	d8b5b544ad	Update list name to match change in spec This commit was SVN r25273.	2011-10-12 20:09:39 +00:00
Mike Dubman	7a9ae43276	added support for shared memory transport in mxm This commit was SVN r25220.	2011-10-03 12:59:55 +00:00
Brian Barrett	fc29ffebdb	* remove two aborts that aren't necessary This commit was SVN r25214.	2011-09-29 22:27:23 +00:00
Brian Barrett	14f32a1a54	* Clean up progress function * Only print returnable errors when verbose=1. Still print errors when we're going to abort, since those obviously aren't returnable This commit was SVN r25213.	2011-09-29 22:26:33 +00:00
Brian Barrett	758f8a4d87	* More debugging output * Make recv short block events use the callback mechanism so that can add overflow debugging This commit was SVN r25212.	2011-09-29 21:59:48 +00:00
Brian Barrett	c08ea5c0f5	Set options correctly for the two pts This commit was SVN r25211.	2011-09-29 21:56:37 +00:00
Brian Barrett	05f800abae	Properly unpack data for long unexpected This commit was SVN r25210.	2011-09-29 17:25:45 +00:00
Brian Barrett	bb9e73232a	* Leverage hdr_data and opcount to improve debugging * Clean up handling of short synchronous messages This commit was SVN r25208.	2011-09-28 21:18:47 +00:00
Brian Barrett	71d8300607	* Fix name clash with macros in mtl_portals4.h * hdr_data now includes opcount and length for all messages, which is the match bits for long and rndv messages * Re-add probe implementation This commit was SVN r25207.	2011-09-28 16:53:01 +00:00
Brian Barrett	2fb8045fad	clean up printfs This commit was SVN r25206.	2011-09-28 15:28:46 +00:00
Brian Barrett	26e781f002	* Remove triggered code for now * Move from per-endpoint send/recv count to just send side op count This commit was SVN r25205.	2011-09-28 15:25:39 +00:00
Brian Barrett	592c1ab6db	* revert probe and size information changes, since it seems to break everything This commit was SVN r25204.	2011-09-28 14:57:19 +00:00
Brian Barrett	211b5c7824	* Make triggered protocol only work for non-wildcard receives * Always encode length in header data to make probe work * General send/receive cleanups * Implement iprobe This commit was SVN r25197.	2011-09-27 22:45:00 +00:00
Brian Barrett	77c560be42	updates to match new api changes This commit was SVN r25196.	2011-09-27 20:38:22 +00:00
Mike Dubman	98f382ba0e	fixes in mxm mtl This commit was SVN r25066.	2011-08-19 22:18:17 +00:00
Mike Dubman	e3c869d83b	fix double free This commit was SVN r25041.	2011-08-10 05:47:55 +00:00
Mike Dubman	a751cd93d3	improve debug macro availability This commit was SVN r25022.	2011-08-09 10:54:08 +00:00
Mike Dubman	bfd75de6f9	fix selection logic: if no suitable device found - disqualify mxm w/o complaints. This commit was SVN r25021.	2011-08-09 07:09:37 +00:00
Mike Dubman	1d3f5e1314	better mxm selection mechanism, some refactoring This commit was SVN r25005.	2011-08-07 12:06:49 +00:00
Mike Dubman	7b18ab2fa9	remove unused includes This commit was SVN r24980.	2011-08-03 07:07:29 +00:00
Mike Dubman	45ea375531	code and readme updates, some refactoring This commit was SVN r24977.	2011-08-02 14:30:11 +00:00
Mike Dubman	aefffa073d	initial implementation of MXM MTL layer This commit was SVN r24946.	2011-07-26 04:36:21 +00:00
Mike Dubman	fd17f20ed5	Currently MTLs do not handle communicator contexts in any special way, they only add the context id to the tag selection of the underlying messaging mechanism. We would like to enable an MTL to maintain its own context data per-communicator. This way an MTL will be able to queue incoming eager messages and rendezvous requests on a per-communicator basis. The MTL will be allowed to override comm->c_pml_comm member, since it's unused in pml_cm anyway. This commit was SVN r24858.	2011-07-06 18:25:49 +00:00
Brian Barrett	e8817f3f63	* Don't send acks for expected triggered messages; still need to get the rest of the data * Don't ask for UNLINK events for persistent long unexpected ME or the get MEs. This commit was SVN r24814.	2011-06-23 16:21:10 +00:00
Brian Barrett	09d89242d6	Crank up the number of short receive blocks so that we're unlikely to hit the flow control case. Uses about same amount of memory as the Portals 3.3 implementations This commit was SVN r24782.	2011-06-16 21:58:53 +00:00
Brian Barrett	4fec0c198d	update short recv blocks to properly set up for triggered operations (where they also store the triggered start message) This commit was SVN r24777.	2011-06-16 16:51:59 +00:00
Brian Barrett	83154af74d	Check return codes a bit more closely Fix broken debug output in any_source recv case Other minor code cleanups This commit was SVN r24774.	2011-06-13 15:18:55 +00:00
Brian Barrett	a7c682cdb0	Fix starting buffer point for triggered get. Should be after the eager part of the message This commit was SVN r24752.	2011-06-06 17:08:13 +00:00
Brian Barrett	b778d785fb	Add some debugging output and fix some places where the output id and verbosity level were swapped This commit was SVN r24740.	2011-06-01 17:20:18 +00:00
Brian Barrett	37d5c7e2ca	* Add ability to set long protocol with MCA parameter * Instead of static arrays of send/recv counts, put them in the endpoint This commit was SVN r24735.	2011-05-26 21:53:39 +00:00
Brian Barrett	beb1bc70b2	* Add support for using modex to exchange NID/PID pairs when using Portals4. Rather than try to support a bunch of lightweight environments like I did with the Portals3 code, always use the "modex" and hack the grpcomm for the SHMEM implementation to return the right nid/pid for a remote process by "magic". This commit was SVN r24733.	2011-05-25 22:10:27 +00:00
Brian Barrett	d8b7ea315e	First take at implementing rndv and triggered protocols This commit was SVN r24699.	2011-05-13 05:57:16 +00:00
Brian Barrett	43902221cc	* Fix bad argument to PtlGet in long receive * Fix bad params when configuring ME for long unexpected This commit was SVN r24698.	2011-05-13 03:56:03 +00:00
Brian Barrett	3d4b7ecbaf	updates for API changes This commit was SVN r24628.	2011-04-20 16:48:27 +00:00
George Bosilca	6fc4c22037	Pedantic. This commit was SVN r24460.	2011-02-25 00:29:48 +00:00
Brian Barrett	4859bb82e2	* Update to support direct call * Add missing cancel (not that it does anything useful) * Fix bug in opal_output call This commit was SVN r24269.	2011-01-19 20:49:28 +00:00
Brian Barrett	6cf74eeb03	Fix bug in looking at convertor_unpack return code. Always print debug on error message for now. This commit was SVN r24163.	2010-12-10 22:36:47 +00:00
Brian Barrett	a26fadb26e	Bring Portals4 updates back into the trunk This commit was SVN r24154.	2010-12-07 20:11:25 +00:00
Nathan Hjelm	94d4aa7253	fixed wrong include This commit was SVN r24133.	2010-12-01 23:10:12 +00:00
Greg Koenig	0694a3203b	This was a small mistake introduced in r23925 in the changes to libevent. This commit was SVN r24043. The following SVN revision numbers were found above: r23925 --> open-mpi/ompi@fceabb2498	2010-11-11 21:54:28 +00:00
Jeff Squyres	64863d086c	Add 2 new MCA params: * mtl_mx_board: allow selection of specific MX NIC/board to use. <0 means "use any board". * mtl_mx_endpoint: allow selection of specific MX endpoint to use. <0 means "use any endpoint". This commit was SVN r23996.	2010-11-05 17:17:20 +00:00
Ralph Castain	fceabb2498	Update libevent to the 2.0 series, currently at 2.0.7rc. We will update to their final release when it becomes available. Currently known errors exist in unused portions of the libevent code. This revision passes the IBM test suite on a Linux machine and on a standalone Mac. This is a fairly intrusive change, but outside of the moving of opal/event to opal/mca/event, the only changes involved (a) changing all calls to opal_event functions to reflect the new framework instead, and (b) ensuring that all opal_event_t objects are properly constructed since they are now true opal_objects. Note: Shiqing has just returned from vacation and has not yet had a chance to complete the Windows integration. Thus, this commit almost certainly breaks Windows support on the trunk. However, I want this to have a chance to soak for as long as possible before I become less available a week from today (going to be at a class for 5 days, and thus will only be sparingly available) so we can find and fix any problems. Biggest change is moving the libevent code from opal/event to a new opal/mca/event framework. This was done to make it much easier to update libevent in the future. New versions can be inserted as a new component and tested in parallel with the current version until validated, then we can remove the earlier version if we so choose. This is a statically built framework ala installdirs, so only one component will build at a time. There is no selection logic - the sole compiled component simply loads its function pointers into the opal_event struct. I have gone thru the code base and converted all the libevent calls I could find. However, I cannot compile nor test every environment. It is therefore quite likely that errors remain in the system. Please keep an eye open for two things: 1. compile-time errors: these will be obvious as calls to the old functions (e.g., opal_evtimer_new) must be replaced by the new framework APIs (e.g., opal_event.evtimer_new) 2. run-time errors: these will likely show up as segfaults due to missing constructors on opal_event_t objects. It appears that it became a typical practice for people to "init" an opal_event_t by simply using memset to zero it out. This will no longer work - you must either OBJ_NEW or OBJ_CONSTRUCT an opal_event_t. I tried to catch these cases, but may have missed some. Believe me, you'll know when you hit it. There is also the issue of the new libevent "no recursion" behavior. As I described on a recent email, we will have to discuss this and figure out what, if anything, we need to do. This commit was SVN r23925.	2010-10-24 18:35:54 +00:00
Brian Barrett	9febaa475e	* Add shell of functionality required for supporting Portals4 * Update places where orte-free builds have failed This commit was SVN r23891.	2010-10-14 22:49:09 +00:00
Jeff Squyres	73bcc4a36b	Fix mistake that came in via the ompi-agen tree in r23764. The mistake wasn't part of the core autogen upgrade; it was an additional 'bonus' cleanup. Oops. The mistake will always create a set of directories under installdir, even if you do not --with-devel-headers. The set of directories will be empty, but still -- they should not be there at all. This commit fixes that -- the directories are not created at all if you do not --with-devel-headers This commit was SVN r23801. The following SVN revision numbers were found above: r23764 --> open-mpi/ompi@40a2bfa238	2010-09-24 22:53:28 +00:00
Ralph Castain	40a2bfa238	WARNING: Work on the temp branch being merged here encountered problems with bugs in subversion. Considerable effort has gone into validating the branch. However, not all conditions can be checked, so users are cautioned that it may be advisable to not update from the trunk for a few days to allow MTT to identify platform-specific issues. This merges the branch containing the revamped build system based around converting autogen from a bash script to a Perl program. Jeff has provided emails explaining the features contained in the change. Please note that configure requirements on components HAVE CHANGED. For example. a configure.params file is no longer required in each component directory. See Jeff's emails for an explanation. This commit was SVN r23764.	2010-09-17 23:04:06 +00:00
Rolf vandeVaart	0324fdb407	Created two new macros that are used when filling in either the status structure or the _ucount field in the status structure. On 64-bit sparc, the macros resolve into integer array assignments. For all others, they are just simple assignments. This fixes possible BUS errors seen when running on the SPARC processor. This bug was introduced when the _count field changed from an int into a size_t. See the changes to request.h for additional details. This commit fixes trac:2514. This commit was SVN r23554. The following Trac tickets were found above: Ticket 2514 --> https://svn.open-mpi.org/trac/ompi/ticket/2514	2010-08-04 19:36:40 +00:00
George Bosilca	733d25a8a3	First step toward fixing the MPI_Get_count issues from the ticket #2241 . Next step is the configure and Fortran mojo that Jeff will put in. Until then I guess the Fortran interface is broken (at least all functions using the hidden count field in the MPI_Status). This commit was SVN r23467.	2010-07-21 20:07:00 +00:00
Jeff Squyres	c8bb7537e7	Remove include/opal/sys/cache.h -- its only purpose in life was to #define CACHE_LINE_SIZE to 128. This name has a conflict on NetBSD, and it seems kinda odd to have a header file that ''only'' defines a single value. Also, we'll soon be raising hwloc to be a first-class item, so having this file around seemed kinda weird. Therefore, I replaced CACHE_LINE_SIZE with opal_cache_line_size, an int (in opal/runtime/opal_init.c and opal/runtime/opal.h) on the rationale that we can fill this in at runtime with hwloc info (trunk and v1.5/beyond, only). The only place we ''needed'' a compile-time CACHE_LINE_SIZE was in the BTL SM (for struct padding), so I made a new BTL_SM_ preprocessor macro with the old CACHE_LINE_SIZE value (128). That use isn't suitable for run-time hwloc information, anyway. This commit was SVN r23349.	2010-07-06 14:33:36 +00:00
Avneesh Pant	8bdd334d95	Allow the PSM component to return ERR_NOT_AVAIL so it can be unloaded silently if executed on a node with no QLogic IB hardware. Also minor modifications to have the CM PML allow itself to be unloaded if no MTL components are available. The component selection logic can then continue to use other PMLs. This commit was SVN r22410.	2010-01-14 19:39:35 +00:00
Avneesh Pant	774b965784	Add in support to specify IB path record query mechanism and IB Application/Service ID for PSM MTL. Also fix a minor bug in calculating the minimum connection timeout. This commit was SVN r22397.	2010-01-13 18:58:00 +00:00
Rainer Keller	8e1b23779f	- Replace combinations of #if defined (c_plusplus) defined (__cplusplus) followed by extern "C" { and the closing counterpart by BEGIN_C_DECLS and END_C_DECLS. Notable exceptions are: - opal/include/opal_config_bottom.h: This is our generated code, that itself defines BEGIN_C_DECL and END_C_DECL - ompi/mpi/cxx/mpicxx.h: Here we do not include opal_config_bottom.h: - Belongs to external code: opal/mca/backtrace/darwin/MoreBacktrace/MoreDebugging/MoreBacktrace.c opal/mca/backtrace/darwin/MoreBacktrace/MoreDebugging/MoreBacktrace.h - opal/include/opal/prefetch.h: Has C++ specific macros that are protected: - Had #if ... } #endif _and_ END_C_DECLS (aka end up with 2x END_C_DECLS) ompi/mca/btl/openib/btl_openib.h - opal/event/event.h has #ifdef __cplusplus as BEGIN_C_DECLS... - opal/win32/ompi_process.h: had extern "C"\n {... opal/win32/ompi_process.h: ditto - ompi/mca/btl/pcie/btl_pcie_lex.l: needed to add *_C_DECLS ompi/mpi/f90/test/align_c.c: ditto - ompi/debuggers/msgq_interface.h: used #ifdef __cplusplus - ompi/mpi/f90/xml/common-C.xsl: Amend Tested on linux using --with-openib and --with-mx The following do not contain either opal_config.h, orte_config.h or ompi_config.h (but possibly other header files, that include one of the above): ompi/mca/bml/r2/bml_r2_ft.h ompi/mca/btl/gm/btl_gm_endpoint.h ompi/mca/btl/gm/btl_gm_proc.h ompi/mca/btl/mx/btl_mx_endpoint.h ompi/mca/btl/ofud/btl_ofud_endpoint.h ompi/mca/btl/ofud/btl_ofud_frag.h ompi/mca/btl/ofud/btl_ofud_proc.h ompi/mca/btl/openib/btl_openib_mca.h ompi/mca/btl/portals/btl_portals_endpoint.h ompi/mca/btl/portals/btl_portals_frag.h ompi/mca/btl/sctp/btl_sctp_endpoint.h ompi/mca/btl/sctp/btl_sctp_proc.h ompi/mca/btl/tcp/btl_tcp_endpoint.h ompi/mca/btl/tcp/btl_tcp_ft.h ompi/mca/btl/tcp/btl_tcp_proc.h ompi/mca/btl/template/btl_template_endpoint.h ompi/mca/btl/template/btl_template_proc.h ompi/mca/btl/udapl/btl_udapl_eager_rdma.h ompi/mca/btl/udapl/btl_udapl_endpoint.h ompi/mca/btl/udapl/btl_udapl_mca.h ompi/mca/btl/udapl/btl_udapl_proc.h ompi/mca/mtl/mx/mtl_mx_endpoint.h ompi/mca/mtl/mx/mtl_mx.h ompi/mca/mtl/psm/mtl_psm_endpoint.h ompi/mca/mtl/psm/mtl_psm.h ompi/mca/pml/cm/pml_cm_component.h ompi/mca/pml/csum/pml_csum_comm.h ompi/mca/pml/dr/pml_dr_comm.h ompi/mca/pml/dr/pml_dr_component.h ompi/mca/pml/dr/pml_dr_endpoint.h ompi/mca/pml/dr/pml_dr_recvfrag.h ompi/mca/pml/example/pml_example.h ompi/mca/pml/ob1/pml_ob1_comm.h ompi/mca/pml/ob1/pml_ob1_component.h ompi/mca/pml/ob1/pml_ob1_endpoint.h ompi/mca/pml/ob1/pml_ob1_rdmafrag.h ompi/mca/pml/ob1/pml_ob1_recvfrag.h ompi/mca/pml/v/pml_v_output.h opal/include/opal/prefetch.h opal/mca/timer/aix/timer_aix.h opal/util/qsort.h test/support/components.h This commit was SVN r21855. The following SVN revision numbers were found above: r2 --> open-mpi/ompi@58fdc18855	2009-08-20 11:42:18 +00:00
Avneesh Pant	261d34db3a	Endpoint options port and outsl only appear post version 0x0107 so conditionally compile them in. This commit was SVN r21812.	2009-08-12 19:59:15 +00:00
Ralph Castain	0c73aa6a97	Fix a couple of errors that are preventing this module from building in MTT. NOTE: there are still two errors that I cannot fix - will send those to devel list This commit was SVN r21809.	2009-08-12 13:18:04 +00:00
Rainer Keller	1bd94f2d98	- When calling ompi_mtl_portals_finalize, when the pml/ob1 is used (aka w/o --mca pml cm), make sure PtlEQGet will actually work on ompi_mtl_portals.ptl_eq_h -- do so without adding code to ompi_mtl_portals_progress. Otherwise we abort() with [nid09979:32503] ompi_mtl_portals_finalize: Going to call ompi_mtl_portals_progress [nid09979:32503] Error returned from PtlEQGet. Error code - 14 [nid09979:32502] Signal: Aborted (6) [nid09979:32502] Signal code: (-6) This commit was SVN r21761.	2009-08-04 22:48:07 +00:00
Avneesh Pant	af09e7678c	Convert a few opal_output() calls to instead use orte_show_help() as well as do some minor cosmetic changes dealing with tab spacing and c-blocks being enclosed with \{\}. There was also a long standing bug with the PSM mtl if the number of hardware contexts on adapter were less than the number of cores on a node (The default case is they are the same hence no issues were reported). For completeness we take care of this case as well but it requires us to tell PSM how many local processes are running on a node and the local rank of the process on a node so it can allocate the available hardware contexts appropriately. This commit was SVN r21745.	2009-07-30 02:55:20 +00:00
Avneesh Pant	38e48d4e2f	Add support for MCA parameters for PSM MTL to specify IB unit, port, IB service level and PSM debug level to use. Also specify in the openib btl params file that QLogic hardware supports a max inlined messages size of 0 only. This commit was SVN r21734.	2009-07-24 20:09:39 +00:00
Rainer Keller	6c5532072a	- Split the datatype engine into two parts: an MPI specific part in OMPI and a language agnostic part in OPAL. The convertor is completely moved into OPAL. This offers several benefits as described in RFC http://www.open-mpi.org/community/lists/devel/2009/07/6387.php namely: - Fewer basic types (int* and float* types, boolean and wchar) - Fixing naming scheme to ompi-nomenclature. - Usability outside of the ompi-layer. - Due to the fixed nature of simple opal types, their information is completely known at compile time and therefore constified - With fewer datatypes (22), the actual sizes of bit-field types may be reduced from 64 to 32 bits, allowing reorganizing the opal_datatype structure, eliminating holes and keeping data required in convertor (upon send/recv) in one cacheline... This has implications to the convertor-datastructure and other parts of the code. - Several performance tests have been run, the netpipe latency does not change with this patch on Linux/x86-64 on the smoky cluster. - Extensive tests have been done to verify correctness (no new regressions) using: 1. mpi_test_suite on linux/x86-64 using clean ompi-trunk and ompi-ddt: a. running both trunk and ompi-ddt resulted in no differences (except for MPI_SHORT_INT and MPI_TYPE_MIX_LB_UB do now run correctly). b. with --enable-memchecker and running under valgrind (one buglet when run with static found in test-suite, committed) 2. ibm testsuite on linux/x86-64 using clean ompi-trunk and ompi-ddt: all passed (except for the dynamic/ tests failed!! as trunk/MTT) 3. compilation and usage of HDF5 tests on Jaguar using PGI and PathScale compilers. 4. compilation and usage on Scicortex. - Please note, that for the heterogeneous case, (-m32 compiled binaries/ompi), neither ompi-trunk, nor ompi-ddt branch would successfully launch. This commit was SVN r21641.	2009-07-13 04:56:31 +00:00
Greg Koenig	60485ff95f	This is a very large change to rename several #define values from OMPI_* to OPAL_*. This allows opal layer to be used more independent from the whole of ompi. NOTE: 9 "svn mv" operations immediately follow this commit. This commit was SVN r21180.	2009-05-06 20:11:28 +00:00
Rainer Keller	1ef32928fd	- get MTL up and running: Nothing notable, except mtl_base_datatype.h -- Undo change from r21096: Yes, we should not include datatype_internal.h, but we did and we have to: we dereference desc, and get an incomplete type, otherwise. This commit was SVN r21103. The following SVN revision numbers were found above: r21096 --> open-mpi/ompi@221fb9dbca	2009-04-29 08:04:16 +00:00
Rainer Keller	221fb9dbca	... Delayed due to notifier commits earlier this day ... - Delete unnecessary header files using contrib/check_unnecessary_headers.sh after applying patches, that include headers, being "lost" due to inclusion in one of the now deleted headers... In total 817 files are touched. In ompi/mpi/c/ header files are moved up into the actual c-file, where necessary (these are the only additional #include), otherwise it is only deletions of #include (apart from the above additions required due to notifier...) - To get different MCAs (OpenIB, TM, ALPS), an earlier version was successfully compiled (yesterday) on: Linux locally using intel-11, gcc-4.3.2 and gcc-SVN + warnings enabled Smoky cluster (x86-64 running Linux) using PGI-8.0.2 + warnings enabled Lens cluster (x86-64 running Linux) using Pathscale-3.2 + warnings enabled This commit was SVN r21096.	2009-04-29 01:32:14 +00:00
Rainer Keller	9dea63d63a	- Last of intrusive commits (promised)... err for now. Anyway, this is blocking the move: do not include pml.h if not really needed, aka none of the following used: mca_pml MCA_PML_CALL OMPI_ANY_TAG OMPI_ANY_SOURCE OMPI_PROC_NULL - Notable exceptions (deleting in one header->adding): - ompi/mca/mtl/psm/ - ompi/mca/osc/rdma/ - ompi/mca/btl/openib/btl_openib_endpoint.c depended on pml_base_sendreq.h - Tested on Linux/x86-64, this time including make check (thanks Jeff and Ralph) This commit was SVN r20725.	2009-03-04 17:06:51 +00:00
Rainer Keller	fd28b392bf	- An intrusive commit yet again (sorry): with the separation we get bitten by header depending on having already included the corresponding [opal\|orte\|ompi]_config.h header. When separating, things like [OPAL\|ORTE\|OMPI]_DECLSPEC are missed. Script to add the corresponding header in front of all following (taking care of possible #ifdef HAVE_...) - Including some minor cleanups to - ompi/group/group.h -- include _after_ #ifndef OMPI_GROUP_H - ompi/mca/btl/btl.h -- include _after_ #ifndef MCA_BTL_H - ompi/mca/crcp/bkmrk/crcp_bkmrk_btl.c -- still no need for orte/util/output.h - ompi/mca/pml/dr/pml_dr_recvreq.c -- no need for mpool.h - ompi/mca/btl/btl.h -- reorder to fit - ompi/mca/bml/bml.h -- reorder to fit - ompi/runtime/ompi_mpi_finalize.c -- reorder to fit - ompi/request/request.h -- additionally need ompi/constants.h - Tested on linux/x86-64 This commit was SVN r20720.	2009-03-04 15:35:54 +00:00
Rainer Keller	4c0e8e1e69	- Header orte/mca/oob/base/base.h is probably the wrong one to include anyhow -- if oob functionality is needed, then orte/mca/oob/oob.h. Nevertheless compiles fine with -Wimplicit-function-declaration This commit was SVN r20641.	2009-02-26 04:20:03 +00:00
Rainer Keller	04567d3af0	- Header orte/mca/errmgr/errmgr.h is not needed. Once again compiles fine with -Wimplicit-function-declaration This commit was SVN r20640.	2009-02-26 04:05:30 +00:00
Rainer Keller	96e1b9b747	- Header orte/mca/rml/rml.h is not needed if no occurrence of orte_rml or ORTE_RML. As the others compiles fine with -Wimplicit-function-declaration This commit was SVN r20639.	2009-02-26 03:52:31 +00:00
Rainer Keller	d81443cc5a	- On the way to get the BTLs split out and lessen dependency on orte: Often, orte/util/show_help.h is included, although no functionality is required -- instead, most often opal_output.h, or orte/mca/rml/rml_types.h Please see orte_show_help_replacement.sh committed next. - Local compilation (Linux/x86_64) w/ -Wimplicit-function-declaration actually showed two missing #include "orte/util/show_help.h" in orte/mca/odls/base/odls_base_default_fns.c and in orte/tools/orte-top/orte-top.c Manually added these. Let's have MTT the last word. This commit was SVN r20557.	2009-02-14 02:26:12 +00:00
Brian Barrett	76f47aaa1c	Fix a bunch of compiler warnings. Refs trac:1458 This commit was SVN r19649. The following Trac tickets were found above: Ticket 1458 --> https://svn.open-mpi.org/trac/ompi/ticket/1458	2008-09-26 16:15:05 +00:00
Brian Barrett	a68e0cff38	* Handle buffered and "complete" sends properly. Since we can always guarantee local progress, my opinion is we don't need to do anything special for complete sends. Perhaps I'm wrong on that one, in which case this implementation of send_complete is not quite right. * Give a slightly better error message when a catchall is hit, so that we can at least know which catchall is hit. This commit was SVN r19456.	2008-08-31 18:05:00 +00:00
Jeff Squyres	0af7ac53f2	Fixes trac:1392, #1400 * add "register" function to mca_base_component_t * converted coll:basic and paffinity:linux and paffinity:solaris to use this function * we'll convert the rest over time (I'll file a ticket once all this is committed) * add 32 bytes of "reserved" space to the end of mca_base_component_t and mca_base_component_data_2_0_0_t to make future upgrades [slightly] easier * new mca_base_component_t size: 196 bytes * new mca_base_component_data_2_0_0_t size: 36 bytes * MCA base version bumped to v2.0 * '''We now refuse to load components that are not MCA v2.0.x''' * all MCA frameworks versions bumped to v2.0 * be a little more explicit about version numbers in the MCA base * add big comment in mca.h about versioning philosophy This commit was SVN r19073. The following Trac tickets were found above: Ticket 1392 --> https://svn.open-mpi.org/trac/ompi/ticket/1392	2008-07-28 22:40:57 +00:00
Ralph Castain	9613b3176c	Effectively revert the orte_output system and return to direct use of opal_output at all levels. Retain the orte_show_help subsystem to allow aggregation of show_help messages at the HNP. After much work by Jeff and myself, and quite a lot of discussion, it has become clear that we simply cannot resolve the infinite loops caused by RML-involved subsystems calling orte_output. The original rationale for the change to orte_output has also been reduced by shifting the output of XML-formatted vs human readable messages to an alternative approach. I have globally replaced the orte_output/ORTE_OUTPUT calls in the code base, as well as the corresponding .h file name. I have test compiled and run this on the various environments within my reach, so hopefully this will prove minimally disruptive. This commit was SVN r18619.	2008-06-09 14:53:58 +00:00
Ralph Castain	c992e99035	Remove the tags from orte_output_open and the filtering operation from orte_output - this will be handled differently to improve the XML output interface This commit was SVN r18557.	2008-06-03 14:24:01 +00:00
George Bosilca	e361bcb64c	Send optimizations. 1. The send path gets shorter. The BTL is allowed to return > 0 to specify that the descriptor was pushed to the networks, and that the memory attached to it is available again for the upper layer. The MCA_BTL_DES_SEND_ALWAYS_CALLBACK flag can be used by the PML to force the BTL to always trigger the callback. Unmodified BTLs will continue to work as expected, as they will return OMPI_SUCCESS which forces the PML to have exactly the same behavior as before. Some BTLs have been modified: self, sm, tcp, mx. 2. Add send immediate interface to BTL. The idea is to have a mechanism of allowing the BTL to take advantage of send optimizations such as the ability to deliver data "inline". Some network APIs such as Portals allow data to be sent using a "thin" event without packing data into a memory descriptor. This interface change allows the BTL to use such capabilities and allows for other optimizations in the future. All existing BTLs except for Portals and sm have this interface set to NULL. This commit was SVN r18551.	2008-05-30 03:58:39 +00:00
Jeff Squyres	e7ecd56bd2	This commit represents a bunch of work on a Mercurial side branch. As such, the commit message back to the master SVN repository is fairly long. = ORTE Job-Level Output Messages = Add two new interfaces that should be used for all new code throughout the ORTE and OMPI layers (we already make the search-and-replace on the existing ORTE / OMPI layers): * orte_output(): (and corresponding friends ORTE_OUTPUT, orte_output_verbose, etc.) This function sends the output directly to the HNP for processing as part of a job-specific output channel. It supports all the same outputs as opal_output() (syslog, file, stdout, stderr), but for stdout/stderr, the output is sent to the HNP for processing and output. More on this below. * orte_show_help(): This function is a drop-in-replacement for opal_show_help(), with two differences in functionality: 1. the rendered text help message output is sent to the HNP for display (rather than outputting directly into the process' stderr stream) 1. the HNP detects duplicate help messages and does not display them (so that you don't see the same error message N times, once from each of your N MPI processes); instead, it counts "new" instances of the help message and displays a message every ~5 seconds when there are new ones ("I got X new copies of the help message...") opal_show_help and opal_output still exist, but they only output in the current process. The intent for the new orte_* functions is that they can apply job-level intelligence to the output. As such, we recommend that all new ORTE and OMPI code use the new orte_* functions, not their opal_* functions. === New code === For ORTE and OMPI programmers, here's what you need to do differently in new code: * Do not include opal/util/show_help.h or opal/util/output.h. Instead, include orte/util/output.h (this one header file has declarations for both the orte_output() series of functions and orte_show_help()).
* Effectively s/opal_output/orte_output/gi throughout your code. Note that orte_output_open() takes a slightly different argument list (as a way to pass data to the filtering stream -- see below), so if you explicitly call opal_output_open(), you'll need to slightly adapt to the new signature of orte_output_open(). * Literally s/opal_show_help/orte_show_help/. The function signature is identical. === Notes === * orte_output'ing to stream 0 will do similar to what opal_output'ing did, so leaving a hard-coded "0" as the first argument is safe. * For systems that do not use ORTE's RML or the HNP, the effect of orte_output_* and orte_show_help will be identical to their opal counterparts (the additional information passed to orte_output_open() will be lost!). Indeed, the orte_* functions simply become trivial wrappers to their opal_* counterparts. Note that we have not tested this; the code is simple but it is quite possible that we mucked something up. = Filter Framework = Messages sent via the new orte_* functions described above and messages output via the IOF on the HNP will now optionally be passed through a new "filter" framework before being output to stdout/stderr. The "filter" OPAL MCA framework is intended to allow preprocessing to messages before they are sent to their final destinations. The first component that was written in the filter framework was to create an XML stream, segregating all the messages into different XML tags, etc. This will allow 3rd party tools to read the stdout/stderr from the HNP and be able to know exactly what each text message is (e.g., a help message, another OMPI infrastructure message, stdout from the user process, stderr from the user process, etc.). Filtering is not active by default. Filter components must be specifically requested, such as: {{{ $ mpirun --mca filter xml ... }}} There can only be one filter component active.
= New MCA Parameters = The new functionality described above introduces two new MCA parameters: * '''orte_base_help_aggregate''': Defaults to 1 (true), meaning that help messages will be aggregated, as described above. If set to 0, all help messages will be displayed, even if they are duplicates (i.e., the original behavior). * '''orte_base_show_output_recursions''': An MCA parameter to help debug one of the known issues, described below. It is likely that this MCA parameter will disappear before v1.3 final. = Known Issues = * The XML filter component is not complete. The current output from this component is preliminary and not real XML. A bit more work needs to be done to configure.m4 search for an appropriate XML library/link it in/use it at run time. * There are possible recursion loops in the orte_output() and orte_show_help() functions -- e.g., if RML send calls orte_output() or orte_show_help(). We have some ideas how to fix these, but figured that it was ok to commit before feature freeze with known issues. The code currently contains sub-optimal workarounds so that this will not be a problem, but it would be good to actually solve the problem rather than have hackish workarounds before v1.3 final. This commit was SVN r18434.	2008-05-13 20:00:55 +00:00
Galen Shipman	92e3b8671f	nasty memory bug... This commit was SVN r18207.	2008-04-18 03:01:53 +00:00
Christian Bell	987de57c9c	Looks like orte/ns is now gone This commit was SVN r17706.	2008-03-05 00:55:43 +00:00
Ralph Castain	d70e2e8c2b	Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. Remains to be tested to ensure everything came over cleanly, so please continue to withhold commits a little longer This commit was SVN r17632.	2008-02-28 01:57:57 +00:00
Galen Shipman	b378c8c12c	return success. This commit was SVN r17612.	2008-02-27 02:15:53 +00:00
Galen Shipman	44003a41f2	Update common_portals to allow using portals interconnect with a modex rather than relying on cnos to get the nid/pid map. This commit was SVN r17588.	2008-02-25 19:17:21 +00:00
Ron Brightwell	b02cad2a0b	added optional rendezvous protocol for long messages This commit was SVN r17124.	2008-01-11 22:12:45 +00:00
Jeff Squyres	213b5d5c6e	Per long threads on the mailing list and much confused discussion about linkers, have all OPAL, ORTE, and OMPI components '''not''' link against the OPAL, ORTE, or OMPI libraries. See http://www.open-mpi.org/community/lists/users/2007/10/4220.php for details (or https://svn.open-mpi.org/trac/ompi/wiki/Linkers for a better-formatted version of the same info). This commit was SVN r16968.	2007-12-15 13:32:02 +00:00
Galen Shipman	4daa552c97	Correct makefile to include all sources, should fix a problem in building a distro.. This commit was SVN r16894.	2007-12-07 18:59:16 +00:00
Ron Brightwell	0138a2ee17	Do cleanup in ompi_mtl_portals_del_procs() rather than ompi_mtl_portals_finalize(). Previous code was cleaning up Portals resources that hadn't been allocated, which caused valid handles used elsewhere to be freed, which broke cnos_barrier() for the Portals btl. This commit was SVN r16801.	2007-11-29 17:29:46 +00:00
Ron Brightwell	a6d6be1bb9	Added send-side optimizations (persistent zero-length md and copy blocks) and support for Accelerated Portals. This commit was SVN r16770.	2007-11-21 21:31:37 +00:00
Rich Graham	27a748e7eb	change all instances of ompi_free_list_init to ompi_free_list_init_new. Header and payload data are specified separately at this stage. This commit was SVN r16633.	2007-11-01 23:38:50 +00:00
George Bosilca	95c9fbdf45	Make sure the MX MTL component is shared between all files. This commit was SVN r16545.	2007-10-22 18:06:52 +00:00
Rich Graham	0de9bd9fa0	when attaching an md for posted receive, generate a start event, so that PtlMDUpdate will pick up all incoming events. This commit was SVN r16517.	2007-10-19 19:09:40 +00:00
Brian Barrett	69952d9603	Fix abort caused by calling PtlEQGet on an invalid eq, which could occur if add_procs was never called. This commit was SVN r15779.	2007-08-06 17:28:11 +00:00
Christian Bell	5ae68f82b2	fix gcc 3.x compilation warnings This commit was SVN r15327.	2007-07-10 13:54:34 +00:00
Brian Barrett	8b9e8054fd	Move modex from pml base to general ompi runtime, since it's used by more than just the PML/BTLs these days. Also clean up the code so that it handles the situation where not all nodes register information for a given node (rather than just spinning until that node sends information, like we do today). Includes r15234 and r15265 from the /tmp/bwb-modex branch. This commit was SVN r15310. The following SVN revisions from the original message are invalid or inconsistent and therefore were not cross-referenced: r15234 r15265	2007-07-09 17:16:34 +00:00
Josh Hursey	acae12d0bb	Fix warning: stderr -> fileno(stderr) This commit was SVN r15207.	2007-06-26 19:28:40 +00:00
Josh Hursey	5199f4123d	Add 2 new MCA parameters to set the size of the expected and unexpected queues. This commit was SVN r15206.	2007-06-26 17:31:43 +00:00
Rich Graham	aa2ffcfcd8	add some output before abort() is called. This commit was SVN r15204.	2007-06-26 15:57:47 +00:00
Galen Shipman	8e7cce813e	don't update MPI_ERROR This commit was SVN r15004.	2007-06-11 21:40:29 +00:00
Galen Shipman	406b05bdc3	update copyright.. This commit was SVN r15003.	2007-06-11 21:17:49 +00:00
Galen Shipman	798cc2c5b8	handle MPI_STATUS_IGNORE in iprobe for the MTLs This commit was SVN r15002.	2007-06-11 20:19:31 +00:00
George Bosilca	5d6c958066	Enable the MTLs to be compiled in a visibility featured environment. This commit was SVN r14955.	2007-06-07 20:14:53 +00:00
Jeff Squyres	51f286d737	Just like r14289 on the ORTE trunk: Per discussions with Brian and Ralph, make a slight correction in where components are installed. Use $pkglibdir, not $libdir/openmpi, so that when compiled in the orte trunk, components are installed to the right directory (because the component search path is checking $pkglibdir). This commit was SVN r14345. The following SVN revisions from the original message are invalid or inconsistent and therefore were not cross-referenced: r14289	2007-04-12 11:19:42 +00:00
Josh Hursey	dadca7da88	Merging in the jjhursey-ft-cr-stable branch (r13912 : HEAD). This merge adds Checkpoint/Restart support to Open MPI. The initial frameworks and components support a LAM/MPI-like implementation. This commit follows the risk assessment presented to the Open MPI core development group on Feb. 22, 2007. This commit closes trac:158 More details to follow. This commit was SVN r14051. The following SVN revisions from the original message are invalid or inconsistent and therefore were not cross-referenced: r13912 The following Trac tickets were found above: Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158	2007-03-16 23:11:45 +00:00
Tim Mattox	ec82d01555	Add a missing extern keyword that prevented compilation on OS X. This commit was SVN r13853.	2007-02-28 20:26:34 +00:00
Sven Stork	870740efe2	- proper export symbols that are required by other components. This commit was SVN r13841.	2007-02-28 12:51:55 +00:00
Ron Brightwell	e15e85a0b6	Fix a problem with long unexpected messages that was causing hangs. Long unexpected messages were not generating PUT_START events because the MD for long unexpected messages was configured to ignore start events. When a long unexpected message arrived, it traversed the match list, and ended up in the long unexpected MD. As the long message is being consumed, the code called PtlMDUpdate() to look for the message, but there was no event that indicated that it had arrived. So, the update succeeded. Once the long unexpected message was consumed, the PUT_END event showed up in the event queue -- except the code wasn't looking for it anymore. The PUT_START events exist specifically to handle ordering between short and long unexpected messages, so PUT_START events can't be ignored on long unexpected messages. Modified the code to generate PUT_START events for both long and short unexpected messages and handle matching up START and END events appropriately. This commit was SVN r13746.	2007-02-21 21:59:48 +00:00
Rich Graham	b925d6588d	add some missing error checking - thanks to Ron B. This commit was SVN r13692.	2007-02-16 22:19:24 +00:00
Galen Shipman	f98a442c82	Fix a problem in the selection logic for MX. Basically we need to be able to open MTL MX and BTL MX and initialize them at the same time. The problem is that both call mx_init and mx_finalize, solution is to add an external entity that does the init and finalize (based on ref counting). This commit was SVN r13576.	2007-02-09 03:19:38 +00:00
Brian Barrett	09cc9e4941	properly compute starting offset -- the lb will be included in the offset, so we don't need both. Refs trac:864 This commit was SVN r13494. The following Trac tickets were found above: Ticket 864 --> https://svn.open-mpi.org/trac/ompi/ticket/864	2007-02-05 18:12:18 +00:00
Galen Shipman	a94101fa62	mostly another hack around for PML selection, allows CM to select itself if an MTL is available; if not, OB1 is used. Still prevents DR and OB1 from stomping on each other though. This commit was SVN r13481.	2007-02-03 02:01:18 +00:00
Christian Bell	e04c55af00	Fixes to psm mtl following a more comprehensive testing of intel tests. This commit was SVN r13471.	2007-02-02 21:55:04 +00:00
Brian Barrett	039a3d8c17	add comment about why there's no status update here, since I always forget This commit was SVN r13400.	2007-01-31 21:39:20 +00:00
Brian Barrett	846eed84f1	When receiving a message, need to account for the fact that the displacement of the first entry might not be the start of the user's buffer. This is similar to what ompi_convertor_unpack does. This is the solution for the test case attached to ticket #690. Refs trac:690 This commit was SVN r13397. The following Trac tickets were found above: Ticket 690 --> https://svn.open-mpi.org/trac/ompi/ticket/690	2007-01-31 18:18:19 +00:00
Brian Barrett	65b07140c0	clean up some of the printf warnings caused by the attribute code This commit was SVN r13395.	2007-01-31 17:11:06 +00:00
Patrick Geoffray	b252cb82c8	oops, ".", not "->", copy error... This commit was SVN r13287.	2007-01-24 19:16:46 +00:00
Patrick Geoffray	d58f6b2451	Free memory in synchronous send case if free_after requires it. Fixes memory leak using synchronous sends and custom data types. This commit was SVN r13286.	2007-01-24 19:10:38 +00:00
Brian Barrett	b8413fb1d5	Just cast the pointer to a uintptr_t then to the match bits, instead of abusing the ompi_ptr_t interface. Not critical for v1.2, as there are no portals platforms that are big endian, so the code in v1.2 will work well enough for now This commit was SVN r13024.	2007-01-07 03:11:27 +00:00
Brian Barrett	48ec0b2071	Revert out r12974, 12976, and 12991 as George has provided a less intrusive fix for now... This commit was SVN r12997. The following SVN revision numbers were found above: r12974 --> open-mpi/ompi@27cea44a9c	2007-01-04 22:07:37 +00:00
Brian Barrett	27cea44a9c	Fix a number of issues with the ompi_ptr_t: * Make sure that the pval always writes to the correct portion of the lval. This only matters on 32 bit big endian machines. * On 32 bit machines when assigning to pval, the other 4 bytes of lval weren't being written, which could lead to bogus data We use macros so that there aren't casts all over the code and the pval assignment can occur to the correct 4 bytes. Refs trac:587 This commit was SVN r12974. The following Trac tickets were found above: Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587	2007-01-03 19:47:48 +00:00
Brian Barrett	6f8b366acb	Rename liborte to libopen-rte and libopal to libopen-pal per telecon today and bug #632. Refs trac:632 This commit was SVN r12762. The following Trac tickets were found above: Ticket 632 --> https://svn.open-mpi.org/trac/ompi/ticket/632	2006-12-05 18:27:24 +00:00
Brian Barrett	bfbc281e93	Fix slow startup issue with the MX MTL. The problem is caused by mx_connect() being a one-sided operation from the API level, but not being an interrupting call when the target is not entering the MX library. So if most of the processes exit mtl_mx_add_procs() and enter the stage gate 2 barrier, the other processes can only progress their mx_connect() calls when the targets enter the mx library. Because the event library is in EV_ONELOOP mode, this only happens once a second. The mx progress thread (hidden in the MX library) also only wakes up once a second, so mx_connect calls can take a second to complete. The temporary solution is to switch into EV_NONBLOCK mode earlier (right after the mx_connect loop) so that there isn't a giant slowdown when processes enter the stage gate 2 barrier before other processes. They will now not block in the event library for any period of time, which appears to have a 50% speedup when running at > 64 procs. Refs trac:645 This commit was SVN r12713. The following Trac tickets were found above: Ticket 645 --> https://svn.open-mpi.org/trac/ompi/ticket/645	2006-12-01 02:49:01 +00:00
Brian Barrett	993d2a7753	Fix for issue IU is seeing on BigRed with connections timing out during MPI_INIT. Use an infinite timeout, which is exactly what MPICH-MX does. This commit was SVN r12669.	2006-11-27 20:10:27 +00:00
George Bosilca	bfbd0e61f6	Minimize the number of lines of code :) This commit was SVN r12550.	2006-11-10 20:56:08 +00:00
George Bosilca	63462331c9	Reduce the number of branches. Keep the fast path as short as possible. Remove some useless error checking. Add OPAL_UNLIKELY directives. This commit was SVN r12477.	2006-11-07 23:59:32 +00:00
George Bosilca	f3de2e1a82	Keep the fast path as short as possible. This commit was SVN r12476.	2006-11-07 23:56:32 +00:00
George Bosilca	108ea4dbe9	When the MX MTL completes a request, force a return from the progress function. Decrease the latency by about 0.3 microseconds. This commit was SVN r12454.	2006-11-06 23:13:07 +00:00
George Bosilca	110d07b7d3	Small optimization for zero length messages. This commit was SVN r12414.	2006-11-02 19:10:28 +00:00
George Bosilca	dbec514b0f	Optimize the generation of the match_bits and the mask. This commit was SVN r12396.	2006-11-01 23:19:20 +00:00
George Bosilca	882b429f64	ompi_mtl_datatype_pack is not a data-type function (really) so it still needs the free_after (which btw has a different meaning than the one removed from the data-type engine a few minutes ago). This commit was SVN r12333.	2006-10-27 00:15:53 +00:00
George Bosilca	126a68dc9a	Big datatype commit. Remove all unused features of the datatype engine. As the memory allocation logic is completely done outside the data-type engine (in the PML) there is no need for any special case inside the data-type engine. There are fewer arguments for the ompi_convertor_pack and ompi_convertor_unpack as well (the last field free_after is not required anymore as there is no memory allocated in the engine itself). This change affects all components using datatypes. I tested most of them, but it might happen that I missed some ... If that's the case please let me know (don't shoot the pianist!!). This commit was SVN r12331.	2006-10-26 23:11:26 +00:00
George Bosilca	18d119bc06	No more warnings. This commit was SVN r12181.	2006-10-18 21:10:11 +00:00
George Bosilca	179067dfb5	Correct a type that breaks the PSM build. This commit was SVN r12069.	2006-10-09 23:14:22 +00:00
Brian Barrett	c3306f7073	* don't abort if we get a status error - just pass it on to the next layer up This commit was SVN r11791.	2006-09-25 17:28:24 +00:00
George Bosilca	688a16ea78	A long-awaited patch. Get rid of the comm->c_pml_procs. It was (and that was long ago) supposed to be used as a cache for accessing the PML procs. But in all of the PMLs the PML proc contains only one field i.e. a pointer to the ompi_proc. This pointer can be accessed using the c_remote_group easily. Therefore, there is no meaning of keeping the PML procs around. Slim fast commit ... This commit was SVN r11730.	2006-09-20 22:14:46 +00:00
Jeff Squyres	d30b1ad61a	Remove a file that should not be in SVN This commit was SVN r11667.	2006-09-15 02:37:34 +00:00
Ralph Castain	37dfdb76eb	Here is the major MAD-cure commit. I have written plenty about it, so I refer you here to those messages for a description of everything that was done. This commit was SVN r11661.	2006-09-14 21:29:51 +00:00
Galen Shipman	877b819ddb	Initial commit of QLogic PSM MTL. This provides support for the Infinipath interconnect using the PSM API. Of note: This version has a "hackaround" we always return 1 or greater from the MTL PSM progress function, this should be examined further. This commit was SVN r11655.	2006-09-14 16:44:02 +00:00
Galen Shipman	0be42b22b9	Fix connection timeout error handling.. This commit was SVN r11651.	2006-09-14 14:02:14 +00:00
George Bosilca	3f0a7cad9e	The last patch for Windows support. Mostly casting and conversion to C++ friendly headers. This commit was SVN r11400.	2006-08-24 16:38:08 +00:00
Brian Barrett	f1bfd174da	* need to set SUCCESS when completing a request This commit was SVN r11255.	2006-08-17 20:03:10 +00:00
Brian Barrett	292068b34b	* check return status of module init This commit was SVN r11235.	2006-08-16 21:27:57 +00:00
Brian Barrett	6d414f2d44	* use the MTL-specific output stream for all error messages * use OPAL_OUTPUT_VERBOSE rather than printfs for debugging messages This commit was SVN r11227.	2006-08-16 16:28:58 +00:00
Brian Barrett	dc74a6a8e1	* implement iprobe for the MX MTL This commit was SVN r11211.	2006-08-15 22:16:50 +00:00
Brian Barrett	0d218c6bdc	* implement cancel for MX This commit was SVN r11209.	2006-08-15 21:59:37 +00:00
Brian Barrett	a21769bbfb	* careful with the opal_output when no components are selected This commit was SVN r11093.	2006-08-02 21:13:33 +00:00
Brian Barrett	bc16f462b9	* print framework and component name during load errors * return a failure from mtl select code if we don't have a component that can run This commit was SVN r11092.	2006-08-02 20:59:58 +00:00
Brian Barrett	dfa1221c3b	* AC_CONFIG_LINKS has a minor problem in that it always uses ln -s, rather than $(LN_S). This causes problems with Windows and probably elsewhere (re: #200). So use a slightly different trick to get the right header selected for the MEMCPY and TIMER components. * Using the same trick used to solve the AC_CONFIG_LINKS problem, stop using a separate header file for direct calling in the PML and MTL. This lets me remove some icky code in ompi_mca.m4 that was more fragile than I really liked. This commit was SVN r10841.	2006-07-16 04:23:52 +00:00
Brian Barrett	d3c6035ea9	* allow direct calling to work with the MX MTL. Had to move some types around so that the myriexpress.h header wasn't included in the same header as the interface declarations This commit was SVN r10817.	2006-07-14 21:32:03 +00:00
Brian Barrett	3b978e3985	* implement short unexpected message copy optimization This commit was SVN r10813.	2006-07-14 19:50:27 +00:00
Galen Shipman	6ed255f114	Substantial changes to the CM PML, allows us to have a very thin request for all but buffered and persistent requests. Unfortunately we were not able to reuse the pml_base_request_t as it was just too heavy for our needs. Lots of code for 2/10 usec ;-) This commit was SVN r10810.	2006-07-14 19:32:26 +00:00
Brian Barrett	ca5bd805db	* add missing continuation line This commit was SVN r10758.	2006-07-12 14:33:08 +00:00
George Bosilca	40f7d054f2	No more unused variables ... This commit was SVN r10735.	2006-07-11 15:24:57 +00:00
Galen Shipman	9a1221bf7d	fix buffered sends (don't use blocking sends!) removed inaccurate comment.. This commit was SVN r10703.	2006-07-10 16:11:14 +00:00
Galen Shipman	5085061475	don't call unpack when we received directly into the user buffer.. the convertor doesn't handle it properly continue peeking until we don't get anything else.. close the endpoint before closing the library.. add a blocking send that uses mx_test .. This commit was SVN r10684.	2006-07-06 19:54:13 +00:00
Brian Barrett	cba9b1e6b7	* the Portals MTL is now stable enough to not have it ompi ignored This commit was SVN r10682.	2006-07-06 18:26:48 +00:00
Brian Barrett	ef6b7e170f	* make mtl datatype wrapper code inline functions This commit was SVN r10678.	2006-07-06 15:58:07 +00:00
Galen Shipman	2217fd4003	reset receive request convertor for persistent requests We can always call unpack.. This commit was SVN r10677.	2006-07-06 15:13:26 +00:00
Brian Barrett	ef8c6a249b	* Fix up some direct-calling issues for the PML/MTL This commit was SVN r10676.	2006-07-06 15:12:38 +00:00
Brian Barrett	95118f83f6	* complete all outstanding Portals events before shutting down * Remove all knowledge of PML requests from the Portals MTL This commit was SVN r10675.	2006-07-06 14:33:29 +00:00
Brian Barrett	c793ad0a3d	unpack the amount received, not the amount we had space to receive. This commit was SVN r10669.	2006-07-05 22:31:29 +00:00
Galen Shipman	c933c0f65f	unpack the length actually received, not the length posted.. This commit was SVN r10668.	2006-07-05 22:16:46 +00:00
Brian Barrett	3e29949cc8	* Fix shutdown code in utcp portals code * make all sends long sends for now in Portals MTL * More optimized match check This commit was SVN r10667.	2006-07-05 21:46:45 +00:00
Galen Shipman	fe480cd003	change mask bits and don't call convertor if we received directly into the user buffer.. This commit was SVN r10665.	2006-07-05 21:10:09 +00:00
Brian Barrett	043153dad3	* fix opal_list_item_t -> ompi_free_list_item_t type change This commit was SVN r10659.	2006-07-05 17:02:16 +00:00
George Bosilca	d2bf3844e9	Include the header file which define opal_output. This commit was SVN r10648.	2006-07-04 06:23:01 +00:00
George Bosilca	402a03d229	Add a .h dependency in order to remove a warning when we compile without --enable-debug. This commit was SVN r10646.	2006-07-04 04:53:38 +00:00
Brian Barrett	47725c9b02	* Add new PML (CM) and network drivers (MTL) for high speed interconnects that provide matching logic in the library. Currently includes support for MX and some support for Portals * Fix overuse of proc_pml pointer on the ompi_proc structure, splitting into proc_pml for pml data and proc_bml for the BML endpoint data * bug fixes in bsend init code, which wasn't being used by the OB1 or DR PMLs... This commit was SVN r10642.	2006-07-04 01:20:20 +00:00


421 commits