openmpi

Author	SHA1	Message	Date
Greg Koenig	60485ff95f	This is a very large change to rename several #define values from OMPI_* to OPAL_*. This allows opal layer to be used more independent from the whole of ompi. NOTE: 9 "svn mv" operations immediately follow this commit. This commit was SVN r21180.	2009-05-06 20:11:28 +00:00
George Bosilca	64075cd54d	Get rid of the bitmap header file. This commit was SVN r20972.	2009-04-10 16:44:37 +00:00
Donald Kerr	ef55aae401	fix #1829 : udapl btl support for relaxed ordering This commit was SVN r20772.	2009-03-13 01:01:00 +00:00
Rainer Keller	9dea63d63a	- Last of intrusive commits (promised)... err for now. Anyway, this is blocking the move: do not include pml.h if not really needed, aka none of the following used: mca_pml MCA_PML_CALL OMPI_ANY_TAG OMPI_ANY_SOURCE OMPI_PROC_NULL - Notable exceptions (deleting in one header->adding): - ompi/mca/mtl/psm/ - ompi/mca/osc/rdma/ - ompi/mca/btl/openib/btl_openib_endpoint.c depended on pml_base_sendreq.h - Tested on Linux/x86-64, this time including make check (thanks Jeff and Ralph) This commit was SVN r20725.	2009-03-04 17:06:51 +00:00
Rainer Keller	fd28b392bf	- An intrusive commit yet again (sorry): with the separation we get bitten by header depending on having already included the corresponding [opal|orte|ompi]_config.h header. When separating, things like [OPAL|ORTE|OMPI]_DECLSPEC are missed. Script to add the corresponding header in front of all following (taking care of possible #ifdef HAVE_...) - Including some minor cleanups to - ompi/group/group.h -- include _after_ #ifndef OMPI_GROUP_H - ompi/mca/btl/btl.h -- include _after_ #ifndef MCA_BTL_H - ompi/mca/crcp/bkmrk/crcp_bkmrk_btl.c -- still no need for orte/util/output.h - ompi/mca/pml/dr/pml_dr_recvreq.c -- no need for mpool.h - ompi/mca/btl/btl.h -- reorder to fit - ompi/mca/bml/bml.h -- reorder to fit - ompi/runtime/ompi_mpi_finalize.c -- reorder to fit - ompi/request/request.h -- additionally need ompi/constants.h - Tested on linux/x86-64 This commit was SVN r20720.	2009-03-04 15:35:54 +00:00
Rainer Keller	811f2bd9b4	- As discussed on RFC, move the ompi_bitmap to the opal layer. Add a check against a maximum (actually get rid of ifs internally to opal_bitmap.c) -- the functionality to set the current maximum size opal_bitmap_set_max_size() is currently only used in attribute.c to set the maximum OMPI_FORTRAN_HANDLE_MAX... Tested on linux/x86-64 with intel-tests with all_tests_no_perf_f run with 6 procs. Let's look into MTT as well... This commit was SVN r20708.	2009-03-03 22:25:13 +00:00
Rainer Keller	32b7189995	- Make usage of BTL_OUTPUT This commit was SVN r20606.	2009-02-20 03:05:14 +00:00
Donald Kerr	e57435a5d4	udapl btl fix for #1725 ; replace WAIT with GET This commit was SVN r20227.	2009-01-08 13:41:36 +00:00
Donald Kerr	213daa58da	support for solaris relaxed ordering This commit was SVN r20167.	2008-12-24 15:05:12 +00:00
Jeff Squyres	0af7ac53f2	Fixes trac:1392, #1400 * add "register" function to mca_base_component_t * converted coll:basic and paffinity:linux and paffinity:solaris to use this function * we'll convert the rest over time (I'll file a ticket once all this is committed) * add 32 bytes of "reserved" space to the end of mca_base_component_t and mca_base_component_data_2_0_0_t to make future upgrades [slightly] easier * new mca_base_component_t size: 196 bytes * new mca_base_component_data_2_0_0_t size: 36 bytes * MCA base version bumped to v2.0 * '''We now refuse to load components that are not MCA v2.0.x''' * all MCA frameworks versions bumped to v2.0 * be a little more explicit about version numbers in the MCA base * add big comment in mca.h about versioning philosophy This commit was SVN r19073. The following Trac tickets were found above: Ticket 1392 --> https://svn.open-mpi.org/trac/ompi/ticket/1392	2008-07-28 22:40:57 +00:00
Ralph Castain	9613b3176c	Effectively revert the orte_output system and return to direct use of opal_output at all levels. Retain the orte_show_help subsystem to allow aggregation of show_help messages at the HNP. After much work by Jeff and myself, and quite a lot of discussion, it has become clear that we simply cannot resolve the infinite loops caused by RML-involved subsystems calling orte_output. The original rationale for the change to orte_output has also been reduced by shifting the output of XML-formatted vs human readable messages to an alternative approach. I have globally replaced the orte_output/ORTE_OUTPUT calls in the code base, as well as the corresponding .h file name. I have test compiled and run this on the various environments within my reach, so hopefully this will prove minimally disruptive. This commit was SVN r18619.	2008-06-09 14:53:58 +00:00
Jeff Squyres	e7ecd56bd2	This commit represents a bunch of work on a Mercurial side branch. As such, the commit message back to the master SVN repository is fairly long. = ORTE Job-Level Output Messages = Add two new interfaces that should be used for all new code throughout the ORTE and OMPI layers (we already make the search-and-replace on the existing ORTE / OMPI layers): * orte_output(): (and corresponding friends ORTE_OUTPUT, orte_output_verbose, etc.) This function sends the output directly to the HNP for processing as part of a job-specific output channel. It supports all the same outputs as opal_output() (syslog, file, stdout, stderr), but for stdout/stderr, the output is sent to the HNP for processing and output. More on this below. * orte_show_help(): This function is a drop-in-replacement for opal_show_help(), with two differences in functionality: 1. the rendered text help message output is sent to the HNP for display (rather than outputting directly into the process' stderr stream) 1. the HNP detects duplicate help messages and does not display them (so that you don't see the same error message N times, once from each of your N MPI processes); instead, it counts "new" instances of the help message and displays a message every ~5 seconds when there are new ones ("I got X new copies of the help message...") opal_show_help and opal_output still exist, but they only output in the current process. The intent for the new orte_* functions is that they can apply job-level intelligence to the output. As such, we recommend that all new ORTE and OMPI code use the new orte_* functions, not thei opal_* functions. === New code === For ORTE and OMPI programmers, here's what you need to do differently in new code: * Do not include opal/util/show_help.h or opal/util/output.h. Instead, include orte/util/output.h (this one header file has declarations for both the orte_output() series of functions and orte_show_help()). 
* Effectively s/opal_output/orte_output/gi throughout your code. Note that orte_output_open() takes a slightly different argument list (as a way to pass data to the filtering stream -- see below), so you if explicitly call opal_output_open(), you'll need to slightly adapt to the new signature of orte_output_open(). * Literally s/opal_show_help/orte_show_help/. The function signature is identical. === Notes === * orte_output'ing to stream 0 will do similar to what opal_output'ing did, so leaving a hard-coded "0" as the first argument is safe. * For systems that do not use ORTE's RML or the HNP, the effect of orte_output_* and orte_show_help will be identical to their opal counterparts (the additional information passed to orte_output_open() will be lost!). Indeed, the orte_* functions simply become trivial wrappers to their opal_* counterparts. Note that we have not tested this; the code is simple but it is quite possible that we mucked something up. = Filter Framework = Messages sent view the new orte_* functions described above and messages output via the IOF on the HNP will now optionally be passed through a new "filter" framework before being output to stdout/stderr. The "filter" OPAL MCA framework is intended to allow preprocessing to messages before they are sent to their final destinations. The first component that was written in the filter framework was to create an XML stream, segregating all the messages into different XML tags, etc. This will allow 3rd party tools to read the stdout/stderr from the HNP and be able to know exactly what each text message is (e.g., a help message, another OMPI infrastructure message, stdout from the user process, stderr from the user process, etc.). Filtering is not active by default. Filter components must be specifically requested, such as: {{{ $ mpirun --mca filter xml ... }}} There can only be one filter component active. 
= New MCA Parameters = The new functionality described above introduces two new MCA parameters: * '''orte_base_help_aggregate''': Defaults to 1 (true), meaning that help messages will be aggregated, as described above. If set to 0, all help messages will be displayed, even if they are duplicates (i.e., the original behavior). * '''orte_base_show_output_recursions''': An MCA parameter to help debug one of the known issues, described below. It is likely that this MCA parameter will disappear before v1.3 final. = Known Issues = * The XML filter component is not complete. The current output from this component is preliminary and not real XML. A bit more work needs to be done to configure.m4 search for an appropriate XML library/link it in/use it at run time. * There are possible recursion loops in the orte_output() and orte_show_help() functions -- e.g., if RML send calls orte_output() or orte_show_help(). We have some ideas how to fix these, but figured that it was ok to commit before feature freeze with known issues. The code currently contains sub-optimal workarounds so that this will not be a problem, but it would be good to actually solve the problem rather than have hackish workarounds before v1.3 final. This commit was SVN r18434.	2008-05-13 20:00:55 +00:00
Ralph Castain	dc7f45dafd	Remove the obsolete and largely unused orte_system_info structure. The only fields that were used in that struct were nodeid and nodename - these have been transferred to the orte_process_info structure. Only one place used the user name field - session_dir, when formulating the name of the top-level directory. Accordingly, the code for getting the user's id has been moved to the session_dir code. This commit was SVN r17926.	2008-03-23 23:10:15 +00:00
Ralph Castain	d70e2e8c2b	Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. Remains to be tested to ensure everything came over cleanly, so please continue to withhold commits a little longer This commit was SVN r17632.	2008-02-28 01:57:57 +00:00
Donald Kerr	5f884b1ca4	fix for #1130 - adds support for multi-rail configurations This commit was SVN r17152.	2008-01-17 17:30:50 +00:00
George Bosilca	6310ce955c	The first patch related to the Active Message stuff. So far, here is what we have: - the registration array is now global instead of one by BTL. - each framework have to declare the entries in the registration array reserved. Then it have to define the internal way of sharing (or not) these entries between all components. As an example, the PML will not share as there is only one active PML at any moment, while the BTLs will have to. The tag is 8 bits long, the first 3 are reserved for the framework while the remaining 5 are use internally by each framework. - The registration function is optional. If a BTL do not provide such function, nothing happens. However, in the case where such function is provided in the BTL structure, it will be called by the BML, when a tag is registered. Now, it's time for the second step... Converting OB1 from a switch based PML to an active message one. This commit was SVN r17140.	2008-01-15 05:32:53 +00:00
George Bosilca	906e8bf1d1	Replace the ompi_pointer_array with opal_pointer_array. The next step (sometime after the merge with the ORTE branch), the opal_pointer_array will become the only pointer_array implementation (the orte_pointer_array will be removed). This commit was SVN r17007.	2007-12-21 06:02:00 +00:00
Donald Kerr	d05d3afaed	clean up and make consistent the reporting out from the udapl btl; report out readable event string instead of just a number This commit was SVN r16954.	2007-12-13 15:32:26 +00:00
Gleb Natapov	e2e211f23b	Add flags parameter to btl_alloc() and btl_prepare_src() functions. If BTL knows at the time of allocation priority of a descriptor it may do some optimizations. This commit was SVN r16901.	2007-12-09 14:08:01 +00:00
Gleb Natapov	7364b7cf47	Add endpoint parameter to btl_alloc() function. Enables various optimizations inside BTL. This commit was SVN r16898.	2007-12-09 14:00:42 +00:00
Donald Kerr	2df5576d1d	add support for if_include/if_exclude mca parameter to allow selection of udapl registry interface adapters; reviewed by rolf van de vaart This commit was SVN r15565.	2007-07-23 19:49:34 +00:00
Donald Kerr	8ecbc71ed2	add support for connection private data, off by default This commit was SVN r14878.	2007-06-05 19:29:50 +00:00
Galen Shipman	3401bd2b07	Add optional ordering to the BTL interface. This is required to tighten up the BTL semantics. Ordering is not guaranteed, but, if the BTL returns a order tag in a descriptor (other than MCA_BTL_NO_ORDER) then we may request another descriptor that will obey ordering w.r.t. to the other descriptor. This will allow sane behavior for RDMA networks, where local completion of an RDMA operation on the active side does not imply remote completion on the passive side. If we send a FIN message after local completion and the FIN is not ordered w.r.t. the RDMA operation then badness may occur as the passive side may now try to deregister the memory and the RDMA operation may still be pending on the passive side. Note that this has no impact on networks that don't suffer from this limitation as the ORDER tag can simply always be specified as MCA_BTL_NO_ORDER. This commit was SVN r14768.	2007-05-24 19:51:26 +00:00
Donald Kerr	588d5bd6a9	clean up compile warnings This commit was SVN r14691.	2007-05-17 23:37:47 +00:00
Donald Kerr	2ed72bf2e2	break evd_qlen into individual qlens (async,dto,conn); add checks based on udapl limits and number of peers This commit was SVN r14659.	2007-05-15 17:47:00 +00:00
Donald Kerr	436d370d51	latency improvements: use ompi_free_list_init_ex, create optimal alignment parameter, remove rdma guarantee path, replace dat_lmt_sync_rdma with use of volatile This commit was SVN r14634.	2007-05-09 19:41:25 +00:00
Donald Kerr	80d984441f	change so that we only check connection queue when expecting a connection; create a mca parameter that controls frequency at which the async queue is checked This commit was SVN r14511.	2007-04-25 17:46:25 +00:00
Donald Kerr	cae24fcde1	move mca parameter registration into own .c and .h files This commit was SVN r14493.	2007-04-24 18:34:16 +00:00
Josh Hursey	dadca7da88	Merging in the jjhursey-ft-cr-stable branch (r13912 : HEAD). This merge adds Checkpoint/Restart support to Open MPI. The initial frameworks and components support a LAM/MPI-like implementation. This commit follows the risk assessment presented to the Open MPI core development group on Feb. 22, 2007. This commit closes trac:158 More details to follow. This commit was SVN r14051. The following SVN revisions from the original message are invalid or inconsistent and therefore were not cross-referenced: r13912 The following Trac tickets were found above: Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158	2007-03-16 23:11:45 +00:00
Jeff Squyres	86f8c66a27	Turns out that the leave_pinned stuff isn't used in these BTLs at all. So just remove it. This commit was SVN r13360.	2007-01-30 15:39:49 +00:00
Donald Kerr	ed097d17c1	fix for bug #749 , though I can not confirm without a linux compiler This commit was SVN r13090.	2007-01-11 22:25:13 +00:00
Donald Kerr	80f2cbb498	add udapl rdma capabilities into the udapl btl This commit was SVN r13082.	2007-01-11 15:22:08 +00:00
Gleb Natapov	190e7a27cd	Merge with gleb-mpool branch. All RDMA components use same mpool now (rdma). udapl/openib/vapi/gm mpools are deprecated. rdma mpool has parameter that allows to limit its size mpool_rdma_rcache_size_limit (default is 0 - unlimited). This commit was SVN r12878.	2006-12-17 12:26:41 +00:00
George Bosilca	a3ad4a7fc8	The visibility flags (and/or Windows friendly export) is now on for all BTLs. This commit was SVN r11662.	2006-09-14 22:19:39 +00:00
Galen Shipman	e5c594c211	More updates for the async error handler for btl's In order to provide backwards compatibility the framework versions are bumped and the handler registration function is at the end of the btl struct. Testing done on sm, openib, and gm.. This commit was SVN r11256.	2006-08-17 22:02:01 +00:00
Donald Kerr	2e5e01a8df	Remove dependency on known port range and allow udapl to provide the port number. This commit was SVN r11040.	2006-07-28 13:58:21 +00:00
Andrew Friedley	c68c6ac122	A number of fixes and the usual cleanup.. - Added some basic flow control to limit number of posted sends. - Merged endpoint send/recv lock into single endpoint lock. - Set the LMR triplet length in the send path, not at allocation time. This has to be done because upper layers might send less than the amount allocated. - Alter the tie-breaker if statement protecting the second call to dat_ep_connect(). The logic was reversed compared to the tie- breaker for the first dat_ep_connect(), making it possible for 3 or more processes to form a deadlock loop. - Some asserts were added for debugging purposes.. leaving them in place for now. This commit was SVN r10317.	2006-06-12 22:42:01 +00:00
Andrew Friedley	74b2f77a4c	The expected cleanup/refactoring commit.. Not much got tested that wasn't already - I've uncovered a connection establishment deadlock and wanted to get these changes committed before I attack it. The big changes: - Moved much of the connection code from btl_udapl_component.c to btl_udapl_endpoint.c. - Cleaned up initialization of various fragment members. - MCA_BTL_UDAPL_ERROR macro, which is compiled in/out appropriately. This commit was SVN r9496.	2006-03-31 16:25:19 +00:00
Andrew Friedley	0eba366b07	Various pieces all over to make basic small message send/recv work. Next step is clean up the code.. it is in need of refactoring and testing. Thanks to Brian for help in troubleshooting! This commit was SVN r9466.	2006-03-29 21:55:41 +00:00
Andrew Friedley	cf9246f7b9	Long overdue commit.. many changes. In short, I'm very close to having connection establishment and eager send/recv working. Part of the connection process involves sending address information from the client to server. For some reason, I am never receiving an event indicating completion of the send on the client side. Otherwise, connection establishment is working and eager send/recv should be trivial from here. Some more detailed changes: - Send partially implemented, just handles starting up new connections. - Several support functions implemented for establishing connection. Client side code went in btl_udapl_endpoint.c, server side in btl_udapl_component.c - Frags list and send/recv locks added to the endpoint structure. - BTL sets up a public service point, which listens for new connections. Steps over ports that are already bound, iterating through a range of ports. - Remove any traces of recv frags, don't think I need them after all. - Pieces of component_progress() implemented for connection establishment. - Frags have two new types for connection establishment - CONN_SEND and CONN_RECV. - Many other minor cleanups not affecting functionality This commit was SVN r9345.	2006-03-21 00:12:55 +00:00
Brian Barrett	566a050c23	Next step in the project split, mainly source code re-arranging - move files out of toplevel include/ and etc/, moving it into the sub-projects - rather than including config headers with <project>/include, have them as <project> - require all headers to be included with a project prefix, with the exception of the config headers ({opal,orte,ompi}_config.h mpi.h, and mpif.h) This commit was SVN r8985.	2006-02-12 01:33:29 +00:00
Andrew Friedley	b37e18916f	Many different things, the big ones: - Start filling in the progress function, focusing on connection establishment. - Initialize udapl mpool and free lists - Create/destroy a protection zone with each IA - Misc organization as I learn how things work This commit was SVN r8969.	2006-02-10 21:49:15 +00:00
Andrew Friedley	5ccab7bcda	Checkpoint: - Move mca_btl_udapl_error/mca_btl_module_init to mca_btl_udapl.c and rename it - White space cleanups - Free the uDAPL evd and ia handles in mca_btl_udapl_finalize This commit was SVN r8705.	2006-01-16 21:54:50 +00:00
Andrew Friedley	a4abe3bdbe	Checkpoint: - Borrow configure.m4 from the mvapi btl. One of the uDAPL headers emits a warning when -pedantic is enabled, so strip it out. - Change function check in ompi_check_dapl.m4 from dat_ia_open to dat_registry_list_providers.. dat_ia_open wasn't working right - Make the references to prepare_dst, put, and get NULL for now - Add opal_output() calls in all the udapl interface functions for debugging - Add evd_qlen component parameter to control event dispatcher queue length - First stab at component_init and module_init - Misc cleanups - whitespace, dead code removal - Update copyrights to 2006 This commit was SVN r8701.	2006-01-16 03:01:12 +00:00
Andrew Friedley	c0bad339af	- Use the GM BTL as a template instead, per Tim's suggestion - Begin adding uDAPL-specific stuff - Added config/ompi_check_udapl.m4 - hopefully I did this right This commit was SVN r8681.	2006-01-12 04:05:02 +00:00
Andrew Friedley	f402854a96	Initial commit of uDAPL BTL component. - Copied the template BTL and renamed everything - Compiles and shows up correctly in ompi_info, not tested past that - Should be ignored for everyone but me This commit was SVN r8544.	2005-12-19 16:37:05 +00:00

46 Commits