openmpi

Author	SHA1	Message	Date
Artem Polyakov	541715572f	Fix MPI_Waitany and MPI_Waitsome (request handling related)	2016-06-28 16:40:00 +03:00
Nathaniel Graham	bb9485bcd9	Fix Java Coverity issue Fixing a possible error that Coverity pointed out in ompi_java_exceptionCheck. Signed-off-by: Nathaniel Graham <nrgraham23@gmail.com>	2016-06-23 15:09:07 +02:00
Nathan Hjelm	143a93f379	opal/sync: remove usage of OPAL_ENABLE_MULTI_THREADS The OPAL_ENABLE_MULTI_THREADS macro is always defined as 1. This was causing us to always use the multi-thread path for synchronization objects. The code has been updated to use the opal_using_threads() function. When MPI_THREAD_MULTIPLE support is disabled at build time (2.x only) this function is a macro evaluating to false so the compiler will optimize out the MT-path in this case. The OPAL_ATOMIC_ADD_32 macro has been removed and replaced by the existing OPAL_THREAD_ADD32 macro. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-06-22 09:52:37 -06:00
Nathaniel Graham	679a66ccc8	Merge pull request #1803 from nrgraham23/jni_error_handling Java bindings exception handling fix	2016-06-22 04:14:00 -07:00
Nathan Hjelm	7bd7c0578b	Merge pull request #1807 from hjelmn/request_perfm_regression ompi/request: fix performance regression	2016-06-21 14:47:56 -06:00
Nathan Hjelm	544adb9aed	ompi/request: fix performance regression This commit fixes a performance regression introduced by the request rework. We were always using the multi-thread path because OPAL_ENABLE_MULTI_THREADS is either not defined or always defined to 1 depending on the Open MPI version. To fix this I removed the conditional and added a conditional on opal_using_threads(). This path will be optimized out in 2.0.0 in a non-thread-multiple build as opal_using_threads is #defined to false in that case. Fixes open-mpi/ompi#1806 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-06-21 11:45:32 -06:00
Nathan Hjelm	2409024c17	osc/rdma: fix typo Need to increment the total size after checking the local offset not before. This typo causes large allocations with MPI_Win_allocate() to fail. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-06-21 09:50:29 -06:00
Nathaniel Graham	88dea4e4de	Java bindings exception handling fix Fixed an error where if there were no MPI exceptions, a JNI error could still exist and not get handled. Signed-off-by: Nathaniel Graham <nrgraham23@gmail.com>	2016-06-21 12:40:31 +02:00
George Bosilca	9c4f56be4b	Fix the coll_base_sendrecv function.	2016-06-18 18:23:51 +02:00
Nathan Hjelm	3a69b727a6	Merge pull request #1788 from hjelmn/split_type comm/split_type: allow MPI_UNDEFINED for split_type	2016-06-16 21:12:25 -06:00
Nathan Hjelm	65be935676	comm/split_type: allow MPI_UNDEFINED for split_type It is valid for any rank to deviate on the split_type argument if they specify MPI_UNDEFINED. The code was incorrectly not allowing this condition. Changed the split_type uniformity check and allow local_size to be 0 if the local split_type is MPI_UNDEFINED. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-06-16 17:42:28 -06:00
rhc54	702a982271	Merge pull request #1767 from rhc54/topic/pmix2 Enable the PMIx event notification capability	2016-06-16 15:27:43 -07:00
Nathan Hjelm	e135543cb0	Merge pull request #1785 from hjelmn/malloc_hook_fix opal/memory: disable __malloc_initialize_hook if poisoned	2016-06-15 14:55:44 -06:00
Nathan Hjelm	7018aeda2b	opal/memory: disable __malloc_initialize_hook if poisoned Newer versions of gcc have "poisoned" the __malloc_initialize_hook name and it can no longer be used. Added a configure check and protection around its usage. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-06-15 12:00:49 -06:00
KAWASHIMA Takahiro	dff6accec6	ompi/datatype: Fix args of DARRAY According to MPI-3.1 P.122, `ni` for `MPI_COMBINER_DARRAY` should be `4ndims+4`, not `4size+4`. This bug may cause SEGV if `size` is smaller than `ndims` when the darray is used for one-sided communication (pt2pt OSC). This bug was introduced in open-mpi/ompi@79b13f36 (when darray became a first class citizen and the `a_i` index of darray was shifted by 2). The corresponding `MPI_Type_create_darray()` function sets a right value so we don't need to update the function.	2016-06-15 11:24:22 +09:00
Ralph Castain	5d330d5220	Enable the PMIx event notification capability and use that for all error notifications, including debugger release. This capability requires use of PMIx 2.0 or above as the features are not available with earlier PMIx releases. When OMPI master is built against an earlier external version, it will fallback to the prior behavior - i.e., debugger will be released via RML and all notifications will go strictly to the default error handler. Add PMIx 2.0 Remove PMIx 1.1.4 Cleanup copying of component Add missing file Touchup a typo in the Makefile.am Update the pmix ext114 component Minor cleanups and resync to master Update to latest PMIx 2.x Update to the PMIx event notification branch latest changes	2016-06-14 13:08:41 -07:00
Jeff Squyres	c2185bb4b8	Merge pull request #1781 from jsquyres/pr/disable-psm-psm2-signal-hijacking PSM/PSM2: Disable signal handler hijacking by default	2016-06-14 15:33:24 -04:00
Jeff Squyres	5071602c59	PSM/PSM2: Disable signal handler hijacking by default Per discussion on https://github.com/open-mpi/ompi/pull/1767 (and some subsequent phone calls and off-issue email discussions), the PSM library is hijacking signal handlers by default. Specifically: unless the environment variables `IPATH_NO_BACKTRACE=1` (for PSM / Intel TrueScale) is set, the library constructor for this library will hijack various signal handlers for the purpose of invoking its own error reporting mechanisms. This may be a bit surprising, but is not a problem, per se. The real problem is that older versions of at least the PSM library do not unregister these signal handlers upon being unloaded from memory. Hence, a segv can actually result in a double segv (i.e., the original segv and then another segv when the now-non-existent signal handler is invoked). This PSM signal hijacking subverts Open MPI's own signal reporting mechanism, which may be a bit surprising for some users (particularly those who do not have Intel TrueScale). As such, we disable it by default so that Open MPI's own error-reporting mechanisms are used. Additionally, there is a typo in the library destructor for the PSM2 library that may cause problems in the unloading of its signal handlers. This problem can be avoided by setting `HFI_NO_BACKTRACE=1` (for PSM2 / Intel OmniPath). This is further compounded by the fact that the PSM / PSM2 libraries can be loaded by the OFI MTL and the usNIC BTL (because they are loaded by libfabric), even when there is no Intel networking hardware present. Having the PSM/PSM2 libraries behave this way when no Intel hardware is present is clearly undesirable (and is likely to be fixed in future releases of the PSM/PSM2 libraries). 
This commit sets the following two environment variables to disable this behavior from the PSM/PSM2 libraries (if they are not already set): * IPATH_NO_BACKTRACE=1 * HFI_NO_BACKTRACE=1 If the user has set these variables before invoking Open MPI, we will not override their values (i.e., their preferences will be honored). Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-06-14 11:45:23 -07:00
Edgar Gabriel	1ddfd6cdca	io/ompio: fix the preallocate function handle preallocating sizes less than the current file size correctly.	2016-06-14 10:50:32 -05:00
KAWASHIMA Takahiro	84b110a1f2	ompi/datatype: Fix args of HINDEXED_BLOCK According to MPI-3.1 P.121, `ni` for `MPI_COMBINER_HINDEXED_BLOCK` should be `2`, not `2 + count`. This bug was introduced in 113b45b4 (when `MPI_Type_create_hindexed_block` support is added in Open MPI) and fixed partially in 7f5314ee and 8de93982. This commit fixes the remaining part. Probably this bug has no user impact. It only consumes a bit more memory.	2016-06-10 17:32:33 +09:00
Gilles Gouaillardet	80e362de52	coll/base: fix memory free in ompi_coll_base_allreduce_intra_recursivedoubling err handler Fix CID 1362630 Fixes open-mpi/ompi@0e393195d9	2016-06-09 13:12:25 +09:00
Gilles Gouaillardet	ead7efef3f	coll/basic: silence CID 1362614 in mca_coll_basic_allreduce_inter()	2016-06-09 09:40:19 +09:00
Gilles Gouaillardet	ad2e1a5ae9	coll/base: silence CID 1362613 in ompi_coll_base_alltoall_intra_basic_linear()	2016-06-09 09:40:05 +09:00
Gilles Gouaillardet	80b267af1c	coll/base: silence CID 1362601 in ompi_coll_base_sendrecv_zero()	2016-06-09 09:37:31 +09:00
Gilles Gouaillardet	0e393195d9	coll/base: fix [all]reduce with non zero lower bound datatypes Offset temporary buffer when a non zero lower bound datatype is used. Thanks Hristo Iliev for the report	2016-06-08 16:48:00 +09:00
Nathan Hjelm	97c1643216	Merge pull request #1766 from hjelmn/req_fix ompi/request: fix loop conditional	2016-06-07 12:11:56 -06:00
Nathan Hjelm	3ddf3ccbf3	Merge pull request #1758 from hjelmn/ob1_fixes pml/ob1: bug fixes	2016-06-07 11:18:55 -06:00
Nathan Hjelm	5a4adb866d	ompi/request: fix loop conditional This commit fixes a bug in waitany that causes the code to go past the beginning of the request array. The loop conditional i >= 0 is invalid since i is unsigned. Changed the loop to check (i+1) > 0. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-06-07 10:28:46 -06:00
Todd Kordenbrock	9671d6af47	Merge pull request #1689 from francois-wellenreiter/remove_trig_rdv_portals4 MTL portals4 : remove the triggered rendez-vous protocol	2016-06-06 21:55:01 -05:00
Nathan Hjelm	5d0b4679ea	pml/ob1: bug fixes This commit fixes two bugs in pml/ob1: - Do not call MCA_PML_OB1_PROGRESS_PENDING from mca_pml_ob1_send_request_start_copy as this may lead to a recursive call to mca_pml_ob1_send_request_process_pending. - In mca_pml_ob1_send_request_start_rdma return the rdma frag object if a btl fragment cannot be allocated. This fixes a leak identified by @abouteiller and @bosilca. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-06-06 17:54:55 -06:00
Gilles Gouaillardet	544a2f1631	configury: fix mpifort and oshmemfort wrapper data NAG compiler use gcc (and not ld) as a linker, so in order to pass an option to the linker, the flag is -Wl,-Wl,,<option> and not -Wl,<option> Thanks Paul Hargrove for the report	2016-06-06 11:54:12 +09:00
Gilles Gouaillardet	c976559877	coll/basic: fix log basic bcast The log basic bcast was completely broken. The rank 0 gets the hibit set to -1, so it always returned an error.	2016-06-06 11:01:51 +09:00
Gilles Gouaillardet	99fedcb7a3	fs/base: silence a memory leak in mca_fs_base_get_fstype() Fixes CID 1351211	2016-06-06 09:20:14 +09:00
George Bosilca	9376b0340b	Fix the basic barrier. The log basic barrier was completely broken. The rank 0 gets the hibit set to 0, so it always returned an error.	2016-06-03 23:46:25 -04:00
Edgar Gabriel	d6af5444a6	fix the get_byte_offset code	2016-06-03 11:36:53 -05:00
Josh Hursey	9f9f70ee50	Merge pull request #1746 from jjhursey/topic/op-init ompi/op: Provide a default value for type/flags	2016-06-03 07:56:29 -05:00
Nathan Hjelm	e968ddfe64	start bug fixes (#1729 ) * mpi/start: fix bugs in cm and ob1 start functions There were several problems with the implementation of start in Open MPI: - There are no checks whatsoever on the state of the request(s) provided to MPI_Start/MPI_Start_all. It is erroneous to provide an active request to either of these calls. Since we are already looping over the provided requests there is little overhead in verifying that the request can be started. - Both ob1 and cm were always throwing away the request on the initial call to start and start_all with a particular request. Subsequent calls would see that the request was pml_complete and reuse it. This introduced a leak as the initial request was never freed. Since the only pml request that can be mpi complete but not pml complete is a buffered send the code to reallocate the request has been moved. To detect that a request is indeed mpi complete but not pml complete isend_init in both cm and ob1 now marks the new request as pml complete. - If a new request was needed the callbacks on the original request were not copied over to the new request. This can cause osc/pt2pt to hang as the incoming message callback is never called. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov> * osc/pt2pt: add request for gc after starting a new request Starting a new receive may cause a recursive call into the pt2pt frag receive function. If this happens and the prior request is on the garbage collection list it could cause problems. This commit moves the gc insert until after the new request has been posted. Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2016-06-02 20:22:40 -04:00
Matias A Cabral	29ab28f4f6	Adding owner.txt file for PSM2 MTL.	2016-06-02 16:26:16 -07:00
Joshua Hursey	a776d78f2d	ompi/op: Provide a default value for type/flags * User defined ops leave the op_type unset which can confuse logic in a collective component that is trying to convert the op to the appropriate local function.	2016-06-02 13:59:04 -05:00
George Bosilca	d577e12dd0	Fix comment.	2016-06-03 00:57:31 +09:00
George Bosilca	fc5d458249	Consistency in handling OPAL_ENABLE_FT_CR. I am not sure if we should continue to maintain the request support for FT_CR, but I tried here to simplify the code while maintaining the same meaning.	2016-06-03 00:54:24 +09:00
Nathan Hjelm	b001184e63	request: fix warnings (#1742 ) Fix warnings introduced by request rework. Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2016-06-02 04:53:16 -04:00
George Bosilca	bfcf145613	Refactor the request test and wait functions.	2016-06-02 11:58:25 +09:00
George Bosilca	2e1b1d34c6	Safety first !	2016-06-02 11:52:43 +09:00
George Bosilca	50cec456fb	ompi_request_complete with signal Rewrite the ompi_request_complete function to take in account the with_signal argument. Change the comment to explain the expected behavior. Alter all the ompi_request_complete uses to make sure the status of the request is set before calling ompi_request_complete. bot🏷️enhancement	2016-06-02 11:49:12 +09:00
George Bosilca	223d75595d	Give a boost to MPI_Barrier. Based on current implementation it is faster to use a blocking send than the non-blocking version. Switch the exchange function used in the barrier to use the blocking version combined with the non-blocking version of the receive.	2016-06-02 11:45:25 +09:00
Ralph Castain	2c086e56be	Add an experimental ability to skip the RTE barriers at the end of MPI_Init and the beginning of MPI_Finalize	2016-06-01 17:01:15 -07:00
Nathan Hjelm	086ffc1838	pml/ob1: fix race on pml completion of send requests The request code was setting the request as pml_complete before calling MCA_PML_OB1_SEND_REQUEST_MPI_COMPLETE. This was causing MCA_PML_OB1_SEND_REQUEST_RETURN to be called twice in some cases. The code now mirrors the recvreq code and only sets the request as pml complete if the request has not already been freed. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-06-01 13:36:06 -06:00
Gilles Gouaillardet	5f565dfec3	configury: clean the flex generated .c files	2016-06-01 11:13:31 +09:00
Gilles Gouaillardet	1bbc5fadee	ompi/win: silence another warning	2016-05-31 13:18:39 +09:00
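
Commit 5a4adb866d above describes a reverse loop over an unsigned index whose `i >= 0` condition can never fail. The following is a minimal sketch of that pitfall and of the `(i+1) > 0` form the commit message mentions. It is not the Open MPI code: the helper name `last_set_index` and the `flags` array are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from Open MPI): return the index of the last
 * non-zero entry in flags[0..count-1], or count if there is none. */
size_t last_set_index(const int *flags, size_t count)
{
    assert(flags != NULL || count == 0);

    /* With an unsigned i, "i >= 0" is always true, so a loop written as
     *     for (i = count - 1; i >= 0; --i)
     * wraps around after index 0 and reads before the start of the array.
     * Testing (i + 1) > 0 stops correctly: once i wraps to SIZE_MAX,
     * i + 1 wraps back to 0 and the condition fails. This also covers
     * count == 0, where the loop body never runs. */
    for (size_t i = count - 1; (i + 1) > 0; --i) {
        if (flags[i] != 0) {
            return i;
        }
    }
    return count;
}
```

An equivalent and common alternative is `for (size_t i = count; i-- > 0; )`, which avoids relying on the wrapped value inside the condition.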

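Commit 5071602c59 above sets `IPATH_NO_BACKTRACE=1` and `HFI_NO_BACKTRACE=1` only when the user has not already set them. One way to get that behavior is POSIX `setenv()` with its `overwrite` argument set to 0. The sketch below assumes a POSIX system; the function names `set_env_default` and `disable_psm_backtrace_defaults` are hypothetical, and the actual Open MPI implementation may differ.

```c
#define _POSIX_C_SOURCE 200112L  /* expose setenv() under strict ISO C modes */
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: set name=value only if name is not already present
 * in the environment. Returns 0 on success, -1 on failure (as setenv does). */
int set_env_default(const char *name, const char *value)
{
    assert(name != NULL && value != NULL);
    /* overwrite == 0: an existing value is left untouched, so the
     * user's own preference is honored. */
    return setenv(name, value, 0);
}

/* Hypothetical wrapper: apply the two defaults named in the commit message. */
int disable_psm_backtrace_defaults(void)
{
    int rc = set_env_default("IPATH_NO_BACKTRACE", "1");
    if (rc != 0) {
        return rc;
    }
    return set_env_default("HFI_NO_BACKTRACE", "1");
}
```

For the defaults to take effect, a call like this would have to run before the PSM/PSM2 libraries are loaded, since the commit message says the signal handlers are installed by the library constructor.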

9027 commits