What's happening is that we're holding openib_btl->eager_rdma_lock when
we call mca_btl_openib_endpoint_send_eager_rdma() on
btl_openib_endpoint.c:1227. This in turn calls
mca_btl_openib_endpoint_send() on line 1179. Then, if the endpoint
state isn't MCA_BTL_IB_CONNECTED or MCA_BTL_IB_FAILED, we call
opal_progress(), where we eventually try to lock
openib_btl->eager_rdma_lock at btl_openib_component.c:997.
The fix removes this lock altogether. Instead, we atomically set the local RDMA
pointer to prevent other threads from creating an RDMA buffer for the same
endpoint, and we increment eager_rdma_buffers_count atomically so the polling
thread doesn't need a lock around it.
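A minimal sketch of the lock-free scheme, using C11 atomics as stand-ins for
OMPI's opal atomic primitives and simplified placeholder structures rather
than the real BTL/endpoint types:

    #include <stdatomic.h>
    #include <stdio.h>

    struct endpoint {
        _Atomic(void *) eager_rdma_buf;      /* NULL until some thread claims creation */
    };

    struct btl {
        atomic_int eager_rdma_buffers_count; /* polled without any lock */
    };

    /* Try to become the single creator of the eager RDMA buffer for this endpoint. */
    static int try_create_eager_rdma(struct btl *btl, struct endpoint *ep, void *buf)
    {
        void *expected = NULL;
        /* Only one thread wins the compare-and-swap; losers back off, so the
         * buffer can never be created twice for the same endpoint. */
        if (!atomic_compare_exchange_strong(&ep->eager_rdma_buf, &expected, buf)) {
            return 0;
        }
        /* Publish the new buffer to the polling thread with an atomic increment,
         * so the progress loop needs no lock around the counter. */
        atomic_fetch_add(&btl->eager_rdma_buffers_count, 1);
        return 1;
    }

    int main(void)
    {
        static char buf[64];
        struct btl b;      atomic_init(&b.eager_rdma_buffers_count, 0);
        struct endpoint ep; atomic_init(&ep.eager_rdma_buf, NULL);
        printf("claimed: %d, count: %d\n",
               try_create_eager_rdma(&b, &ep, buf),
               atomic_load(&b.eager_rdma_buffers_count));
        return 0;
    }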
This commit was SVN r12369.
I have no idea why this function always returns a failure. It should
always return SUCCESS (provided the status is valid).
This commit was SVN r12364.
The following Trac tickets were found above:
Ticket 496 --> https://svn.open-mpi.org/trac/ompi/ticket/496
* For MPI_TEST, MPI_TESTANY, MPI_WAIT, and MPI_WAITANY (i.e., the
TEST/WAIT functions that return at most one completed
request), return the actual error code.
* For MPI_TESTALL, MPI_TESTSOME, MPI_WAITALL, MPI_WAITSOME, (i.e.,
the TEST/WAIT functions that can return more than one completed
request), return MPI_ERR_IN_STATUS.
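For reference, this is what the distinction looks like from the caller's side
(standard MPI semantics, assuming the communicator's error handler has been
set to MPI_ERRORS_RETURN):

    #include <mpi.h>
    #include <stdio.h>

    /* Wait on two requests; per the rules above, WAITALL reports only
     * MPI_ERR_IN_STATUS and the per-request codes land in the statuses. */
    static void wait_two(MPI_Request reqs[2])
    {
        MPI_Status statuses[2];
        int rc = MPI_Waitall(2, reqs, statuses);
        if (MPI_ERR_IN_STATUS == rc) {
            for (int i = 0; i < 2; ++i) {
                if (MPI_SUCCESS != statuses[i].MPI_ERROR) {
                    printf("request %d failed with error %d\n",
                           i, statuses[i].MPI_ERROR);
                }
            }
        }

        /* In contrast, MPI_Wait on a single request returns the actual
         * error code directly. */
    }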
This commit was SVN r12355.
The following Trac tickets were found above:
Ticket 549 --> https://svn.open-mpi.org/trac/ompi/ticket/549
This commit essentially caches the invoking comm/win/file on the
ompi_request_t. This, paired with the req_type field, allows us to
retrieve the invoking MPI object and invoke the proper errhandler.
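A simplified sketch of the idea (hypothetical names; the real fields live on
ompi_request_t and the real dispatch goes through OMPI's errhandler macros):

    #include <stdio.h>

    struct comm_sketch { int id; };
    struct win_sketch  { int id; };
    struct file_sketch { int id; };

    /* Stand-ins for the comm/win/file errhandler invocation paths. */
    static void comm_errhandler(struct comm_sketch *c, int err)
    { printf("comm %d: err %d\n", c->id, err); }
    static void win_errhandler(struct win_sketch *w, int err)
    { printf("win %d: err %d\n", w->id, err); }
    static void file_errhandler(struct file_sketch *f, int err)
    { printf("file %d: err %d\n", f->id, err); }

    enum req_obj_type { REQ_OBJ_COMM, REQ_OBJ_WIN, REQ_OBJ_FILE };

    struct request_sketch {
        enum req_obj_type obj_type;     /* analogous to req_type */
        union {                         /* cached when the request is created */
            struct comm_sketch *comm;
            struct win_sketch  *win;
            struct file_sketch *file;
        } invoking_object;
    };

    /* With the invoking object cached on the request, raising an MPI
     * exception is just a switch on the request type. */
    static void raise_exception(struct request_sketch *req, int errcode)
    {
        switch (req->obj_type) {
        case REQ_OBJ_COMM: comm_errhandler(req->invoking_object.comm, errcode); break;
        case REQ_OBJ_WIN:  win_errhandler(req->invoking_object.win, errcode);   break;
        case REQ_OBJ_FILE: file_errhandler(req->invoking_object.file, errcode); break;
        }
    }

    int main(void)
    {
        struct comm_sketch comm = { 42 };
        struct request_sketch req = { REQ_OBJ_COMM, { .comm = &comm } };
        raise_exception(&req, 1);   /* some error code */
        return 0;
    }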
The patch is missing most updates for the MPI-2 one-sided stuff (i.e.,
the patch mainly fixes comms and files); I didn't really understand
that code and didn't want to hazard trying to figure it out when Brian
can probably do it much more quickly.
So #250 will still stay open, pending MPI-2 one-sided updates for this
stuff.
This commit was SVN r12339.
The following Trac tickets were found above:
Ticket 250 --> https://svn.open-mpi.org/trac/ompi/ticket/250
* Create a new request type: NOOP (described below)
* For all MPI_*_INIT functions, OBJ_NEW an ompi_request_t and set its
type to NOOP
* Ensure that the NOOP requests are OBJ_RELEASE'd when they are done
* MPI_START looks at the request type; if NOOP, just return success (see
the sketch after this list). If not, call the PML start() function
* MPI_STARTALL always passes the entire array of requests back to the PML
(see next point)
* Make the PMLs only process PML requests (i.e., ignore/skip anything
that isn't of type PML -- such as the NOOP requests)
* Add a little more param error checking in STARTALL
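A minimal sketch of the MPI_START logic described above (hypothetical names;
the real code lives in the MPI layer and dispatches into the selected PML):

    /* Request types, reduced to the two that matter here. */
    enum req_type { REQ_TYPE_NOOP, REQ_TYPE_PML };

    struct request_sketch { enum req_type type; };

    /* Stand-in for the PML's start() entry point. */
    static int pml_start(struct request_sketch *req) { (void)req; return 0; }

    static int mpi_start_sketch(struct request_sketch *req)
    {
        if (REQ_TYPE_NOOP == req->type) {
            return 0;               /* MPI_SUCCESS: nothing to start */
        }
        return pml_start(req);      /* everything else goes to the PML */
    }

    /* STARTALL hands the whole array to the PML, which itself skips anything
     * that is not a PML request (such as the NOOP requests). */

    int main(void)
    {
        struct request_sketch noop = { REQ_TYPE_NOOP };
        return mpi_start_sketch(&noop);   /* succeeds without touching the PML */
    }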
This commit was SVN r12338.
The following Trac tickets were found above:
Ticket 529 --> https://svn.open-mpi.org/trac/ompi/ticket/529
* Remove an extra OMPI_REQUEST_INIT() from the grequest constructor
(it was already invoked by the parent MPI_Request constructor)
* Set the state of the generalized request to ACTIVE (because this is
invoked from MPI_GREQUEST_START -- analogous to MPI_START)
* Before invoking the query function in MPI_REQUEST_COMPLETE, set the
status on the base request to ompi_status_empty. This gives a set
of default values for the request, including one for
status.MPI_ERROR = MPI_SUCCESS (because we check the value of
MPI_ERROR in MPI_TEST* and MPI_WAIT* processing, and use it to
determine whether the upper-level API call should raise an MPI
exception or not).
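For context, here is how a generalized request is driven from the application
side (standard MPI-2 API, not OMPI-internal code); the fixes above concern
what OMPI does underneath GREQUEST_START and the completion path:

    #include <mpi.h>

    static int query_fn(void *extra_state, MPI_Status *status)
    {
        /* With the fix above, *status arrives pre-set to an "empty" status
         * (including MPI_ERROR == MPI_SUCCESS) before this callback runs. */
        MPI_Status_set_elements(status, MPI_BYTE, 0);
        MPI_Status_set_cancelled(status, 0);
        status->MPI_SOURCE = MPI_UNDEFINED;
        status->MPI_TAG = MPI_UNDEFINED;
        (void)extra_state;
        return MPI_SUCCESS;
    }

    static int free_fn(void *extra_state) { (void)extra_state; return MPI_SUCCESS; }

    static int cancel_fn(void *extra_state, int complete)
    { (void)extra_state; (void)complete; return MPI_SUCCESS; }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        MPI_Request req;
        MPI_Status status;

        /* The request becomes ACTIVE here (analogous to MPI_START). */
        MPI_Grequest_start(query_fn, free_fn, cancel_fn, NULL, &req);

        /* ... normally an external engine finishes the work; here we just
         * mark the request complete ourselves ... */
        MPI_Grequest_complete(req);

        /* WAIT invokes query_fn and then looks at status.MPI_ERROR to decide
         * whether to raise an MPI exception. */
        MPI_Wait(&req, &status);

        MPI_Finalize();
        return 0;
    }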
This commit was SVN r12337.
The following Trac tickets were found above:
Ticket 532 --> https://svn.open-mpi.org/trac/ompi/ticket/532
allocation logic is completely done outside the data-type engine (in the PML), so there
is no need for any special case inside the data-type engine. There are fewer arguments
to ompi_convertor_pack and ompi_convertor_unpack as well (the last field, free_after, is
not required anymore, since no memory is allocated in the engine itself). This change
affects all components using datatypes. I tested most of them, but I might have missed
some ... If that's the case, please let me know (don't shoot the pianist!!).
This commit was SVN r12331.
The default decision functions (for broadcast, reduce, and barrier) are based on a
high-performance network (not TCP). They should give good performance (really good) for
any network with the following characteristics: small latency (5 microseconds) and good
bandwidth (more than 1 Gb/s).
+ Cleanup of the reduce algorithms, plus 2 new algorithms (binary and binomial). Now most
of the reduce algorithms use a generic tree-based function for completing the reduce.
+ Added macros for computing the trees (they are used for bcast and reduce right now);
a generic sketch of the tree computation follows this list.
+ Allow the usage of all 5 topologies.
+ Jelena's implementation of a binary tree that can be used for non-commutative operations.
Right now only the tree-building function is there; it will get activated soon.
+ Some other minor cleanups.
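As a generic illustration of the kind of tree computation involved (not the
actual OMPI macros), here is how a binomial tree rooted at rank 0 can be
derived from bit operations on the rank:

    #include <stdio.h>

    /* Parent of `rank` in a binomial tree rooted at 0: clear the lowest set bit. */
    static int binomial_parent(int rank)
    {
        return (0 == rank) ? -1 : (rank & (rank - 1));
    }

    /* Children of `rank`: rank + 2^k for every k below the lowest set bit of
     * rank (all k for the root), as long as the child rank stays below size. */
    static void print_binomial_children(int rank, int size)
    {
        for (int mask = 1; rank + mask < size; mask <<= 1) {
            if (rank & mask) {
                break;               /* reached the lowest set bit: no more children */
            }
            printf("  child of %d: %d\n", rank, rank + mask);
        }
    }

    int main(void)
    {
        const int size = 8;
        for (int rank = 0; rank < size; ++rank) {
            printf("rank %d, parent %d\n", rank, binomial_parent(rank));
            print_binomial_children(rank, size);
        }
        return 0;
    }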
This commit was SVN r12326.
A segfault would occur in mca_pml_ob1_recv_request_progress() when trying to prepare the convertor for unpacking, because the request's req_proc field was NULL.
Turns out that we weren't setting the req_proc field in the MCA_PML_OB1_CHECK_SPECIFIC_AND_WILD_RECEIVES_FOR_MATCH macro. Instead of just setting it there, I removed the other place where req_proc was being set correctly and instead took care of all the cases at once in mca_pml_ob1_recv_frag_match().
This commit was SVN r12323.
parameter. For optimisation purposes, only this BTL is used to send the packet,
instead of trying to send packets through all BTLs. But the code was actually
wrong: it simply used the provided bml_btl, which may represent a different
endpoint than the packet's destination. The fixed code checks whether the
packet's destination is reachable through the BTL, finds the appropriate
bml_btl, and only then tries to send through the correct bml_btl.
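A rough sketch of the corrected lookup, with hypothetical structure names (the
real code works on the OB1/BML endpoint structures):

    #include <stddef.h>

    struct btl;                          /* one network module */
    struct bml_btl { struct btl *btl; }; /* a BTL as usable for one destination */

    struct bml_endpoint {                /* per-destination set of usable BTLs */
        int n_btls;
        struct bml_btl *btls;
    };

    /* Instead of blindly using the bml_btl that was passed in, find the
     * bml_btl of the packet's destination that corresponds to the wanted BTL. */
    static struct bml_btl *find_bml_btl_for_dest(struct bml_endpoint *dest,
                                                 struct btl *wanted)
    {
        for (int i = 0; i < dest->n_btls; ++i) {
            if (dest->btls[i].btl == wanted) {
                return &dest->btls[i];   /* destination reachable through this BTL */
            }
        }
        return NULL;                     /* not reachable through the requested BTL */
    }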
This commit was SVN r12319.
is done to assure alignment, so strictly aligned CPUs (like SPARC) do not
SIGBUS. This may also benefit other platforms.
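A generic illustration of the kind of padding involved (not the actual OMPI
code): round an offset up to the strictest alignment so the data that follows
is naturally aligned.

    #include <stdint.h>
    #include <stdio.h>

    /* Round `value` up to the next multiple of `align` (align must be a power of two). */
    #define ALIGN_UP(value, align) (((value) + (align) - 1) & ~((uintptr_t)(align) - 1))

    int main(void)
    {
        size_t header_len = 13;                                  /* odd-sized header */
        size_t payload_off = ALIGN_UP(header_len, sizeof(double));
        /* An 8-byte-aligned payload offset avoids the unaligned loads that
         * raise SIGBUS on strictly aligned CPUs such as SPARC. */
        printf("payload starts at offset %zu instead of %zu\n", payload_off, header_len);
        return 0;
    }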
This commit fixes trac:494.
This commit was SVN r12312.
The following Trac tickets were found above:
Ticket 494 --> https://svn.open-mpi.org/trac/ompi/ticket/494
If you want to look at our launch and MPI process startup times, you can do so with two MCA params:
OMPI_MCA_orte_timing: set it to anything non-zero and you will get the launch time for different steps in the job launch procedure. The degree of detail depends on the launch environment. rsh will provide you with the average, min, and max launch time for the daemons. SLURM block launches the daemon, so you only get the time to launch the daemons and the total time to launch the job. Ditto for bproc. TM looks more like rsh. Only those four environments are currently supported - anyone interested in extending this capability to other environs is welcome to do so. In all cases, you also get the time to setup the job for launch.
OMPI_MCA_ompi_timing: set it to anything non-zero and you will get the time for mpi_init to reach the compound registry command, the time to execute that command, the time to go from our stage1 barrier to the stage2 barrier, and the time to go from the stage2 barrier to the end of mpi_init. This will be output for each process, so you'll have to compile any statistics on your own. Note: if someone develops a nice parser to do so, it would be really appreciated if you could/would share!
This commit was SVN r12302.
"this is bogus" kind of answer. Passing in bad error codes should
only happen in erroneous sections of the OMPI code base, but still,
it's far more social to print a message saying, "hey, you messed up!"
rather than seg faulting.
Reviewed by Edgar.
This commit was SVN r12295.
mentioned in the comment, the completion/callback of the triggered
send operation can happen before the call returns. If this happens, and
the pipeline depth is 0 before we trigger the send operation, and
this is the last send operation of the request, then the completion detection
code will decrement the pipeline depth and check it for equality to 0.
Because (0 - 1) != 0, the pml completion function for this request will
*not* be called.
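A minimal sketch of the ordering issue (hypothetical names; the actual fix in
the PML may be structured differently): because the completion callback can
run before the send call returns, the pipeline depth has to be bumped before
the send is triggered, not after.

    #include <stdio.h>

    struct req {
        int pipeline_depth;
        int completed;
    };

    /* Completion detection: the request is finished when the depth drops to 0. */
    static void completion_callback(struct req *r)
    {
        if (0 == --r->pipeline_depth) {
            r->completed = 1;                /* fire the pml completion here */
        }
    }

    /* Stand-in for the BTL send; here the callback fires before the call
     * returns, which is exactly the situation described above. */
    static void trigger_send(struct req *r)
    {
        completion_callback(r);
    }

    static void schedule_last_fragment(struct req *r)
    {
        /* Count the fragment BEFORE triggering it.  If the increment happened
         * after trigger_send(), the callback would compute 0 - 1 != 0 and the
         * request's completion would be lost. */
        r->pipeline_depth++;
        trigger_send(r);
    }

    int main(void)
    {
        struct req r = { 0, 0 };
        schedule_last_fragment(&r);
        printf("completed: %d\n", r.completed);  /* prints: completed: 1 */
        return 0;
    }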
This is part 2 of the fix for ticket #246.
This commit was SVN r12292.
MPI_PROC_NULL translates to MPI_PROC_NULL, and an MPI_GROUP_EMPTY as one of
the groups doesn't cause a segmentation fault, but returns MPI_UNDEFINED for
all ranks to be translated.
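A standard-MPI illustration of the behavior being fixed:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        MPI_Group world_group;
        MPI_Comm_group(MPI_COMM_WORLD, &world_group);

        /* MPI_PROC_NULL translates to MPI_PROC_NULL ... */
        int ranks_in[2] = { 0, MPI_PROC_NULL };
        int ranks_out[2];
        MPI_Group_translate_ranks(world_group, 2, ranks_in, world_group, ranks_out);

        /* ... and translating against MPI_GROUP_EMPTY yields MPI_UNDEFINED
         * for every rank instead of crashing. */
        int one_rank[1] = { 0 };
        int empty_out[1];
        MPI_Group_translate_ranks(world_group, 1, one_rank, MPI_GROUP_EMPTY, empty_out);

        printf("proc_null -> %s, empty group -> %s\n",
               (ranks_out[1] == MPI_PROC_NULL) ? "MPI_PROC_NULL" : "?",
               (empty_out[0] == MPI_UNDEFINED) ? "MPI_UNDEFINED" : "?");

        MPI_Group_free(&world_group);
        MPI_Finalize();
        return 0;
    }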
This commit was SVN r12233.
possible things contained in the conditional like other rules are (for
example, a SOURCES rule in a conditional automatically has its files
added to the dist rules, even if that conditional isn't true when
make dist occurs). So the man files weren't in the tarball.
Put the EXTRA_DIST with the files explicitly listed outside any conditionals
so the man pages always end up in the tarball.
This commit was SVN r12220.
all platforms. The only exceptions (and I will not deal with them
anytime soon) are on Windows:
- the write functions which require the length to be an int when it's
a size_t on all UNIX variants.
- all iovec manipulation functions where the iov_len is again an int
when it's a size_t on most of the UNIXes.
As these only happen on Windows, I think we're set for now :)
This commit was SVN r12215.