openmpi

Author	SHA1	Message	Date
Nathan Hjelm	fd42343ff0	osc/pt2pt: reduce memory footprint of window This commit updates osc/pt2pt to allocate peer object as they are needed rather than all at once. Additionally, to help improve the memory footprint a new synchronization structure has been added. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-09-16 13:01:56 -06:00
Gilles Gouaillardet	21b1e7f8c5	mpi conformance: fix prototypes - MPI_Compare_and_swap - MPI_Fetch_and_op - MPI_Raccumulate - MPI_Win_detach Thanks to Michael Knobloch and Takahiro Kawashima for bringing this to our attention	2015-08-31 10:34:05 +09:00
Gilles Gouaillardet	1488e82efd	osc/pt2pt: enable heterogeneous support	2015-05-14 16:42:48 +09:00
Nathan Hjelm	29b435a5a4	osc/pt2pt: fix bugs that caused incorrect fragment counting This commit fixes a bug identified by MTT that occurred when mixing passive and active target synchronization. The bugs fixed in this commit are: - Do not update incoming fragment counts for any type of unbuffered control message. These messages are out-of-band and should not be considered towards the signal counts. - Complete a change from using received counts to expected counts for lock, unlock, and flush acks. Part of the change made it into master before the rest was ready. This was preventing wakeups in some cases. - Turn the passive_target_access_epoch module member into a counter. As long as at least one peer is locked we are in a passive-target epoch and not an active target one. This fix will ensure that fragment flags are set appropriately. fixes #538 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-04-23 13:22:24 -06:00
Nathan Hjelm	e68ed2876c	osc/pt2pt: threading fixes and code cleanup	2015-01-06 13:39:16 -07:00
Nathan Hjelm	9eba7b9d35	Rename the OSC "rdma" component to pt2p to better reflect that it does not actually use btl rdma	2015-01-06 13:38:55 -07:00
Nathan Hjelm	1b564f62bd	Revert "Merge pull request #275 from hjelmn/btlmod" This reverts commit `ccaecf0fd6`, reversing changes made to `6a19bf85dd`.	2014-11-19 23:22:43 -07:00
Nathan Hjelm	22625b005b	osc/pt2pt: threading fixes and code cleanup	2014-11-19 11:33:04 -07:00
Nathan Hjelm	29e4e1c90a	Rename the OSC "rdma" component to pt2p to better reflect that it does not actually use btl rdma	2014-11-19 11:33:03 -07:00
Ralph Castain	49d938de29	Merge one-sided updates to the trunk - written by Brian Barrett and Nathan Hjelmn cmr=v1.7.5:reviewer=hjelmn:subject=Update one-sided to MPI-3 This commit was SVN r30816.	2014-02-25 17:36:43 +00:00
Ralph Castain	5d1fa4fa0e	Silence warnings: osc_pt2pt_data_move.c: In function 'ompi_osc_pt2pt_sendreq_recv_accum_long_cb': osc_pt2pt_data_move.c:643:9: warning: variable 'ret' set but not used [-Wunused-but-set-variable] osc_rdma_data_move.c: In function 'ompi_osc_rdma_control_send_cb': osc_rdma_data_move.c:1312:37: warning: variable 'header' set but not used [-Wunused-but-set-variable] This commit was SVN r29092.	2013-08-29 20:56:36 +00:00
Nathan Hjelm	9d4a26f47d	Update OMPI frameworks to use the MCA framework system. Notes: - This commit also eliminates the need for an available components list in use in several frameworks. None of the code in question was making use of the priority field of the priority component list item so these extra lists were removed. - Cleaned up selection code in several frameworks to sort lists using opal_list_sort. - Cleans up the ompi/orte-info functions. Expose the functions that construct the list of params so they can be used elsewhere. patches for mtl/portals4 from brian missed a few output variables in openib This commit was SVN r28241.	2013-03-27 21:17:31 +00:00
Shiqing Fan	1ed0f40d35	Fix a few type casts on Windows. This commit was SVN r24857.	2011-07-06 08:08:53 +00:00
Brian Barrett	a4b2bd903b	* Implement long-ago discussed RFC to add a callback data pointer in the request completion callback * Use the completion callback pointer to remove all need for opal_progress calls in the one-sided layer This commit was SVN r24848.	2011-06-30 20:05:16 +00:00
Eugene Loh	2770a12beb	Continue clean up of thread options started in r22841, 22842, and 22849. No need for any CMRs to 1.5... that was already done in CMR 2728. This commit was SVN r24545. The following SVN revision numbers were found above: r22841 --> open-mpi/ompi@b400b84162	2011-03-18 21:36:35 +00:00
Rainer Keller	6c5532072a	- Split the datatype engine into two parts: an MPI specific part in OMPI and a language agnostic part in OPAL. The convertor is completely moved into OPAL. This offers several benefits as described in RFC http://www.open-mpi.org/community/lists/devel/2009/07/6387.php namely: - Fewer basic types (int* and float* types, boolean and wchar - Fixing naming scheme to ompi-nomenclature. - Usability outside of the ompi-layer. - Due to the fixed nature of simple opal types, their information is completely known at compile time and therefore constified - With fewer datatypes (22), the actual sizes of bit-field types may be reduced from 64 to 32 bits, allowing reorganizing the opal_datatype structure, eliminating holes and keeping data required in convertor (upon send/recv) in one cacheline... This has implications to the convertor-datastructure and other parts of the code. - Several performance tests have been run, the netpipe latency does not change with this patch on Linux/x86-64 on the smoky cluster. - Extensive tests have been done to verify correctness (no new regressions) using: 1. mpi_test_suite on linux/x86-64 using clean ompi-trunk and ompi-ddt: a. running both trunk and ompi-ddt resulted in no differences (except for MPI_SHORT_INT and MPI_TYPE_MIX_LB_UB do now run correctly). b. with --enable-memchecker and running under valgrind (one buglet when run with static found in test-suite, commited) 2. ibm testsuite on linux/x86-64 using clean ompi-trunk and ompi-ddt: all passed (except for the dynamic/ tests failed!! as trunk/MTT) 3. compilation and usage of HDF5 tests on Jaguar using PGI and PathScale compilers. 4. compilation and usage on Scicortex. - Please note, that for the heterogeneous case, (-m32 compiled binaries/ompi), neither ompi-trunk, nor ompi-ddt branch would successfully launch. This commit was SVN r21641.	2009-07-13 04:56:31 +00:00
Greg Koenig	60485ff95f	This is a very large change to rename several #define values from OMPI_* to OPAL_*. This allows opal layer to be used more independent from the whole of ompi. NOTE: 9 "svn mv" operations immediately follow this commit. This commit was SVN r21180.	2009-05-06 20:11:28 +00:00
Brian Barrett	7f898d4e2b	* Make rdma the default. Somehow, the code didn't match what was supposed to happen * Properly error out (rather than cause buffer overflow) in case where the datatype packed description is larger than our control fragments. This still isn't standards conforming, but at least we know what happened. * Expose win_set_name to external libraries (like the osc modules) * Set default window name to the CID of the communicator it's using for communication Refs trac:1905 This commit was SVN r21134. The following Trac tickets were found above: Ticket 1905 --> https://svn.open-mpi.org/trac/ompi/ticket/1905	2009-04-30 22:36:09 +00:00
Terry Dontje	0178b6c45f	Added padding to predefined handle structures to maintain library version to version compatibility. This commit was SVN r20627.	2009-02-24 17:17:33 +00:00
Rainer Keller	d81443cc5a	- On the way to get the BTLs split out and lessen dependency on orte: Often, orte/util/show_help.h is included, although no functionality is required -- instead, most often opal_output.h, or orte/mca/rml/rml_types.h Please see orte_show_help_replacement.sh commited next. - Local compilation (Linux/x86_64) w/ -Wimplicit-function-declaration actually showed two missing #include "orte/util/show_help.h" in orte/mca/odls/base/odls_base_default_fns.c and in orte/tools/orte-top/orte-top.c Manually added these. Let's have MTT the last word. This commit was SVN r20557.	2009-02-14 02:26:12 +00:00
Rolf vandeVaart	9c080b27d6	Fix for bug when running 64-bit heterogeneous. This commit fixes trac:1341. This commit was SVN r18940. The following Trac tickets were found above: Ticket 1341 --> https://svn.open-mpi.org/trac/ompi/ticket/1341	2008-07-17 19:04:40 +00:00
Ralph Castain	9613b3176c	Effectively revert the orte_output system and return to direct use of opal_output at all levels. Retain the orte_show_help subsystem to allow aggregation of show_help messages at the HNP. After much work by Jeff and myself, and quite a lot of discussion, it has become clear that we simply cannot resolve the infinite loops caused by RML-involved subsystems calling orte_output. The original rationale for the change to orte_output has also been reduced by shifting the output of XML-formatted vs human readable messages to an alternative approach. I have globally replaced the orte_output/ORTE_OUTPUT calls in the code base, as well as the corresponding .h file name. I have test compiled and run this on the various environments within my reach, so hopefully this will prove minimally disruptive. This commit was SVN r18619.	2008-06-09 14:53:58 +00:00
Jeff Squyres	e7ecd56bd2	This commit represents a bunch of work on a Mercurial side branch. As such, the commit message back to the master SVN repository is fairly long. = ORTE Job-Level Output Messages = Add two new interfaces that should be used for all new code throughout the ORTE and OMPI layers (we already make the search-and-replace on the existing ORTE / OMPI layers): * orte_output(): (and corresponding friends ORTE_OUTPUT, orte_output_verbose, etc.) This function sends the output directly to the HNP for processing as part of a job-specific output channel. It supports all the same outputs as opal_output() (syslog, file, stdout, stderr), but for stdout/stderr, the output is sent to the HNP for processing and output. More on this below. * orte_show_help(): This function is a drop-in-replacement for opal_show_help(), with two differences in functionality: 1. the rendered text help message output is sent to the HNP for display (rather than outputting directly into the process' stderr stream) 1. the HNP detects duplicate help messages and does not display them (so that you don't see the same error message N times, once from each of your N MPI processes); instead, it counts "new" instances of the help message and displays a message every ~5 seconds when there are new ones ("I got X new copies of the help message...") opal_show_help and opal_output still exist, but they only output in the current process. The intent for the new orte_* functions is that they can apply job-level intelligence to the output. As such, we recommend that all new ORTE and OMPI code use the new orte_* functions, not thei opal_* functions. === New code === For ORTE and OMPI programmers, here's what you need to do differently in new code: * Do not include opal/util/show_help.h or opal/util/output.h. Instead, include orte/util/output.h (this one header file has declarations for both the orte_output() series of functions and orte_show_help()). 
* Effectively s/opal_output/orte_output/gi throughout your code. Note that orte_output_open() takes a slightly different argument list (as a way to pass data to the filtering stream -- see below), so you if explicitly call opal_output_open(), you'll need to slightly adapt to the new signature of orte_output_open(). * Literally s/opal_show_help/orte_show_help/. The function signature is identical. === Notes === * orte_output'ing to stream 0 will do similar to what opal_output'ing did, so leaving a hard-coded "0" as the first argument is safe. * For systems that do not use ORTE's RML or the HNP, the effect of orte_output_* and orte_show_help will be identical to their opal counterparts (the additional information passed to orte_output_open() will be lost!). Indeed, the orte_* functions simply become trivial wrappers to their opal_* counterparts. Note that we have not tested this; the code is simple but it is quite possible that we mucked something up. = Filter Framework = Messages sent view the new orte_* functions described above and messages output via the IOF on the HNP will now optionally be passed through a new "filter" framework before being output to stdout/stderr. The "filter" OPAL MCA framework is intended to allow preprocessing to messages before they are sent to their final destinations. The first component that was written in the filter framework was to create an XML stream, segregating all the messages into different XML tags, etc. This will allow 3rd party tools to read the stdout/stderr from the HNP and be able to know exactly what each text message is (e.g., a help message, another OMPI infrastructure message, stdout from the user process, stderr from the user process, etc.). Filtering is not active by default. Filter components must be specifically requested, such as: {{{ $ mpirun --mca filter xml ... }}} There can only be one filter component active. 
= New MCA Parameters = The new functionality described above introduces two new MCA parameters: * '''orte_base_help_aggregate''': Defaults to 1 (true), meaning that help messages will be aggregated, as described above. If set to 0, all help messages will be displayed, even if they are duplicates (i.e., the original behavior). * '''orte_base_show_output_recursions''': An MCA parameter to help debug one of the known issues, described below. It is likely that this MCA parameter will disappear before v1.3 final. = Known Issues = * The XML filter component is not complete. The current output from this component is preliminary and not real XML. A bit more work needs to be done to configure.m4 search for an appropriate XML library/link it in/use it at run time. * There are possible recursion loops in the orte_output() and orte_show_help() functions -- e.g., if RML send calls orte_output() or orte_show_help(). We have some ideas how to fix these, but figured that it was ok to commit before feature freeze with known issues. The code currently contains sub-optimal workarounds so that this will not be a problem, but it would be good to actually solve the problem rather than have hackish workarounds before v1.3 final. This commit was SVN r18434.	2008-05-13 20:00:55 +00:00
Ralph Castain	fa082cafa9	Shift the architecture calculation from the ompi/datatype engine to the opal/util area. This allows us to compute the architecture earlier in the launch and communicate it outside of the modex. Note: this is an early preliminary step in the movement of portions of the datatype engine to the opal layer. This commit was SVN r18198.	2008-04-17 20:43:56 +00:00
Shiqing Fan	79da2fdd2c	Use the new memchecker convertor function. Remove some unnecessary memchecker calls. This commit was SVN r18172.	2008-04-16 13:24:35 +00:00
Shiqing Fan	54c7b71cfd	Use the correct way of including memchecker.h, which will work with '--with-devel-headers'. This commit was SVN r17435.	2008-02-12 18:01:17 +00:00
Shiqing Fan	f5792bbda5	merging the memchecker into trunk. This commit was SVN r17424.	2008-02-12 08:46:27 +00:00
Tim Prins	b88a3f7a94	Update onesided components to fix the case (on 64 bit machines) where the total offset is greater than 2^31-1 bytes. See: http://www.open-mpi.org/community/lists/users/2008/01/4880.php This commit was SVN r17400.	2008-02-07 18:45:35 +00:00
Brian Barrett	7a9a8c7e17	Support reduction operations other than MPI_REPLACE for user-defined datatypes with MPI_ACCUMULATE This commit was SVN r15418.	2007-07-13 20:46:12 +00:00
Brian Barrett	739fed9dc9	Don't poke at internal structure fields of communicators or groups, but instead use accessor functions This commit was SVN r15366.	2007-07-11 17:16:06 +00:00
Brian Barrett	a2713dcac8	eeks! Bad to notice after committing the pt2pt part of r14806 that the compile failed because of the wrong variable name. This commit was SVN r14807. The following SVN revision numbers were found above: r14806 --> open-mpi/ompi@7e57bbb0ef	2007-05-30 20:33:08 +00:00
Brian Barrett	7e57bbb0ef	React slightly better when datatype creation from a buffer fails This commit was SVN r14806.	2007-05-30 20:32:02 +00:00
Sven Stork	88f0845c44	- let the pt2pt component compile with threads enabled This commit was SVN r14725.	2007-05-23 12:56:34 +00:00
Brian Barrett	38eab3613b	* Fix race condition with the pending_{in,out} variables -- if we're going to do while(...) { } then we can't change the variables in the ... atomically, but should do it while holding the module lock. * Fix dumb communicator creation error when we don't create the progress stuff (because a window already exists), where we would accidentally jump to the error case. This commit was SVN r14715.	2007-05-21 20:53:02 +00:00
Brian Barrett	2b4b754925	Some much needed cleanup of the point-to-point one-sided component... * Combine polling of the long requests and buffer requests into one type, and in one place * Associate the list of requests to poll with the component, not the individual modules * add progress thread that sits on the OMPI request structure and wakes up at the appropriate time to poll the message list. Not the best, but without some asynch notification from the PML that a given set of requests has completed, there isn't much better * Instead of calling opal_progress() all over the place, move to using the condition variables like the rest of the project. Has the advantage of moving it slightly futher along in the becoming thread safe thing * Fix a problem with the passive side of unlock where it could go recursive and cause all kinds of problems, especially when progress threads are used. Instead, have two parts of passive unlock -- one to start the unlock, and another to complete the lock and send the ack back. The data moving code trips the second at the right time. This commit was SVN r14703.	2007-05-21 02:21:25 +00:00
Brian Barrett	d9e0e80190	Make some debugging output only looked at when debugging is enabled This commit was SVN r13777.	2007-02-25 01:03:19 +00:00
Brian Barrett	385a435813	Start long message send as soon as possible, to minimize ack time for the receive, greatly increasing mid-range bandwidth This commit was SVN r13317.	2007-01-25 23:07:03 +00:00
Brian Barrett	48ec0b2071	Revert out r12974, 12976, and 12991 as George has provided a less intrusive fix for now... This commit was SVN r12997. The following SVN revision numbers were found above: r12974 --> open-mpi/ompi@27cea44a9c	2007-01-04 22:07:37 +00:00
Brian Barrett	27cea44a9c	Fix a number of issues with the ompi_ptr_t: * Make sure that the pval always writes to the correct portion of the lval. This only matters on 32 bit big endian machines. * On 32 bit machines when assigning to pval, the other 4 bytes of lval weren't being written, which could lead to bogus data We use macros so that there aren't casts all over the code and the pval assignment can occur to the correct 4 bytes. Refs trac:587 This commit was SVN r12974. The following Trac tickets were found above: Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587	2007-01-03 19:47:48 +00:00
Brian Barrett	beb1e9d4dd	* finish move from hard coded tag to #define'd constant tag This commit was SVN r12674.	2006-11-27 21:55:41 +00:00
Brian Barrett	0c25f7be09	More One-sided fixes: * Fix a counter roll-over issue that could result from a large (but not excessive) number of outstanding put/get/accumulate calls during a single synchronization issues (Refs trac:506) * Fix epoch issue with rdma component that would effect PWSC synchronization (Refs trac:507) This commit was SVN r12673. The following Trac tickets were found above: Ticket 506 --> https://svn.open-mpi.org/trac/ompi/ticket/506 Ticket 507 --> https://svn.open-mpi.org/trac/ompi/ticket/507	2006-11-27 21:41:29 +00:00
Brian Barrett	63e5668e29	Number of one-sided fixes: * use one-sided datatype check instead of send/receive and check both the origin and target datatypes * allow error handler to be set on MPI_WIN_NULL, per standard * Allow recursive calls into the pt2pt osc component's progress function * Fix an uninitialized variable problem in the unlock header This commit was SVN r12667.	2006-11-27 03:22:44 +00:00
George Bosilca	126a68dc9a	Big datatype commit. Remove all unused features of the datatype engine. As the memory allocation logic is completely done outside the data-type engine (in the PML) there is no need for any special case inside the data-type engine. There is less arguments for the ompi_convertor_pack and ompi_convertor_unpack as well (the last field free_after is not required anymore as there is no memory allocated in the engine itself). This change affect all components using datatypes. I test most of them, but it might happens that I miss some ... If it's the case please let me know (don't shoot the pianist!!). This commit was SVN r12331.	2006-10-26 23:11:26 +00:00
George Bosilca	8852c00c36	Look like a big commit but in fact it address only one issue. The way we're working with size and displacement of data-type. After this patch all data can contain size_t bytes and the displacements are defined as ptrdiff_t. All of the files I was able to compile have been modified to match this requirement. This commit was SVN r12146.	2006-10-17 20:20:58 +00:00
George Bosilca	688a16ea78	A long time waiting patch. Get rid of the comm->c_pml_procs. It was (and that was long ago) supposed to be used as a cache for accessing the PML procs. But in all of the PMLs the PML proc contain only one field i.e. a pointer to the ompi_proc. This pointer can be accessed using the c_remote_group easily. Therefore, there is no meaning of keeping the PML procs around. Slim fast commit ... This commit was SVN r11730.	2006-09-20 22:14:46 +00:00
George Bosilca	3f0a7cad9e	The last patch for Windows support. Mostly casting and conversion to C++ friendly headers. This commit was SVN r11400.	2006-08-24 16:38:08 +00:00
Brian Barrett	df84dbad00	* use the osc base debugging stream for all output, and do the whole verbose MCA param thing instead of changing -1 to 0 and back in the output stream param. This commit was SVN r11245.	2006-08-17 14:52:20 +00:00
Brian Barrett	9f28258b3f	* squelch stupid compiler warning This commit was SVN r11111.	2006-08-03 14:42:05 +00:00
Brian Barrett	0ba0a60ada	* Merge in new version of the pt2pt one-sided communication component, implemented entirely on top of the PML. This allows us to have a one-sided interface even when we are using the CM PML and MTLs for point-to-point transport (and therefore not using the BML/BTLs) * Old pt2pt component was renamed "rdma", as it will soon be having real RDMA support added to it. Work was done in a temporary branch. Commit is the result of the merge command: svn merge -r10862:11099 https://svn.open-mpi.org/svn/ompi/tmp/bwb-osc-pt2pt This commit was SVN r11100. The following SVN revisions from the original message are invalid or inconsistent and therefore were not cross-referenced: r10862 r11099	2006-08-03 00:10:19 +00:00
Brian Barrett	2185c059e8	* use opal_free_list_item_t as the type of items stored in an opal_free_list_t, rather than assuming it's an opal_list_item_t. This commit was SVN r10860.	2006-07-17 21:51:50 +00:00


65 commits