openmpi

Author	SHA1	Message	Date
Ralph Castain	0e433eaa78	Silence warning	2016-07-11 19:43:02 -07:00
Nathan Hjelm	b47208e909	osc/rdma: fix bug in CAS This commit fixes a bug in the RDMA compare-and-swap implementation that caused the origin value to always be written even if the compare should have failed. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-07-11 09:54:23 -06:00
Edgar Gabriel	c8b1c6cae1	Merge pull request #1856 from edgargabriel/pr/zero-size-iread-iwrite io/ompio: fix the request in case of a zero size write/read operation	2016-07-11 08:19:02 -05:00
Gilles Gouaillardet	14624506df	coll/libnbc: do not exchange data between roots in ompi_coll_libnbc_ireduce_scatter_inter() this is now useless since the scatter is done via the local communicator	2016-07-11 17:18:30 +09:00
Edgar Gabriel	3dd81e9e09	io/ompio: fix the request in case of a zero size write/read operation	2016-07-08 14:11:22 -05:00
Gilles Gouaillardet	a55d57406b	coll/base: fix non zero lower bound datatype handling in mca_coll_base_alltoallv_intra_basic_inplace()	2016-07-08 16:55:26 +09:00
Gilles Gouaillardet	7b8094aac1	coll/base: silence misc warning as reported by Coverity with CIDs 1363349-1363362 Offset temporary buffer when a non zero lower bound datatype is used. Thanks Hristo Iliev for the report (cherry picked from commit `0e393195d9`)	2016-07-08 13:06:26 +09:00
Gilles Gouaillardet	678d08647b	coll/libnbc: various fixes - correctly handle non commutative operators - correctly handle non zero lower bound ddt - correctly handle ddt with size > extent - revamp NBC_Sched_op so it takes two buffers and matches ompi_op_reduce semantic - various fix for inter communicators Thanks Yuki Matsumoto for the report	2016-07-07 15:55:49 +09:00
Gilles Gouaillardet	3e559a14a9	coll/inter: fix non standard ddt handling - correctly handle non zero lower bound ddt - correctly handle ddt with size > extent Thanks Yuki Matsumoto for the report	2016-07-07 15:49:59 +09:00
Gilles Gouaillardet	488d037d51	coll/basic: fix non standard ddt handling - correctly handle non zero lower bound ddt - correctly handle ddt with size > extent Thanks Yuki Matsumoto for the report	2016-07-07 15:49:53 +09:00
Gilles Gouaillardet	c06fb04a9a	coll/base: fix non zero lower bound ddt handling in ompi_coll_base_reduce_intra_basic_linear() Thanks Yuki Matsumoto for the report	2016-07-07 15:49:48 +09:00
Ralph Castain	ee56d9dc1a	Shorten the session directory name as some OS's are now providing unusually long temp directory names, causing us to overflow the sockaddr field	2016-07-05 14:59:50 -07:00
George Bosilca	eac5b3c668	Various cleanups in the monitoring PML.	2016-07-05 18:31:25 +02:00
George Bosilca	73972768f8	Remove an apparently useless function.	2016-07-05 18:30:11 +02:00
Josh Hursey	59bf1f0c41	Merge pull request #1836 from jjhursey/topic/coll-nbc-0-count-ireduce mpi/c: Add each check for count==0 in nonblocking reduce interface	2016-07-01 15:22:37 -05:00
Josh Hursey	9b4ed968a4	Merge pull request #1833 from jjhursey/topic/op-init-fix op: Add a default value for MPI_OP o_name	2016-07-01 15:22:15 -05:00
Joshua Hursey	0671e45de0	op: Add a default value for MPI_OP o_name	2016-07-01 13:46:01 -05:00
Joshua Hursey	96779f68e8	mpi/c: Add each check for count==0 in nonblocking reduce interface * Matches the blocking versions of these interfaces - `iallreduce.c` to match `allreduce.c` - `ireduce.c` to match `reduce.c` - `ireduce_scatter.c` to match `reduce_scatter.c` * Workaround for IMB-NBC benchmark, similar to the workaround in place for the IMB-MPI1 benchmark for the blocking collectives.	2016-07-01 13:45:30 -05:00
Joshua Hursey	0a09f8bc51	coll/hcoll: Protect module destruct when not fully initialized * If hcoll is given a negative priority, but not enabled=0 then the module is constructed, but then destructed before calling its query(). So the previous pointers are not initialized. If we try to OBJ_RELEASE them in a debug build an assert will fire. This commit adds some protection against that and initializes the _module pointers to NULL.	2016-07-01 13:41:27 -05:00
Joshua Hursey	59f304b9e9	coll/base: neg. priority cleanup, verbose output improvements * Print a verbose message if the component was disqualified because of a negative priority. * If a disqualified component provided a module, release it. * Display list of selected components in priority order - During the process of volunteering collective functions for a communicator, print the component name and priority. This will cause the verbose messages to be displayed in reverse priority order (lowest priority first, up to highest). This is helpful when determining which collective components are active in which order for a given communicator. To see the messages you need the following MCA parameter set to 9 or higher: `-mca coll_base_verbose 9` * Adjust verbose for commonly needed verbose output from 10 to 9 to make it easier to access this information.	2016-07-01 13:41:27 -05:00
Nathan Hjelm	f38cc00df9	Merge pull request #1835 from hjelmn/thread_fix ompi/request: fix hang in ompi_request_wait_completion	2016-06-30 18:50:32 -06:00
Nathan Hjelm	445b79bba8	ompi/request: fix hang in ompi_request_wait_completion This commit fixes a hang reported by @nysal which happens when a request is completed after a sync object is created but before the sync object can be assigned to the request. In this case we need to set the sync signaling field to false to ensure WAIT_SYNC_RELEASE does not hang. Fixes open-mpi/ompi#1828 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-06-30 09:54:00 -06:00
Nathan Hjelm	5f390b5f5a	bml/r2: be more restrictive on rdma endpoints This commit makes bml/r2 more restrictive on which endpoints end up in the rdma endpoint list. Before this commit an endpoint was added if it supported either put or get. This was done to ensure that endpoints are available for RMA. Though it is possible to support put or get endpoints we only currently support endpoints that have put, get, and amos. bml/r2 now reflects this support. Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2016-06-29 18:54:58 -06:00
Artem Polyakov	541715572f	Fix MPI_Waitany and MPI_Waitsome (request handling related)	2016-06-28 16:40:00 +03:00
Nathaniel Graham	bb9485bcd9	Fix Java Coverity issue Fixing a possible error that Coverity pointed out in ompi_java_exceptionCheck. Signed-off-by: Nathaniel Graham <nrgraham23@gmail.com>	2016-06-23 15:09:07 +02:00
Nathan Hjelm	143a93f379	opal/sync: remove usage of OPAL_ENABLE_MULTI_THREADS The OPAL_ENABLE_MULTI_THREADS macro is always defined as 1. This was causing us to always use the multi-thread path for synchronization objects. The code has been updated to use the opal_using_threads() function. When MPI_THREAD_MULTIPLE support is disabled at build time (2.x only) this function is a macro evaluating to false so the compiler will optimize out the MT-path in this case. The OPAL_ATOMIC_ADD_32 macro has been removed and replaced by the existing OPAL_THREAD_ADD32 macro. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-06-22 09:52:37 -06:00
Nathaniel Graham	679a66ccc8	Merge pull request #1803 from nrgraham23/jni_error_handling Java bindings exception handling fix	2016-06-22 04:14:00 -07:00
Nathan Hjelm	7bd7c0578b	Merge pull request #1807 from hjelmn/request_perfm_regression ompi/request: fix performance regression	2016-06-21 14:47:56 -06:00
Nathan Hjelm	544adb9aed	ompi/request: fix performance regression This commit fixes a performance regression introduced by the request rework. We were always using the multi-thread path because OPAL_ENABLE_MULTI_THREADS is either not defined or always defined to 1 depending on the Open MPI version. To fix this I removed the conditional and added a conditional on opal_using_threads(). This path will be optimized out in 2.0.0 in a non-thread-multiple build as opal_using_threads is #defined to false in that case. Fixes open-mpi/ompi#1806 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-06-21 11:45:32 -06:00
Nathan Hjelm	2409024c17	osc/rdma: fix typo Need to increment the total size after checking the local offset not before. This typo causes large allocations with MPI_Win_allocate() to fail. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-06-21 09:50:29 -06:00
Nathaniel Graham	88dea4e4de	Java bindings exception handling fix Fixed an error where if there were no MPI exceptions, a JNI error could still exist and not get handled. Signed-off-by: Nathaniel Graham <nrgraham23@gmail.com>	2016-06-21 12:40:31 +02:00
George Bosilca	9c4f56be4b	Fix the coll_base_sendrecv function.	2016-06-18 18:23:51 +02:00
Nathan Hjelm	3a69b727a6	Merge pull request #1788 from hjelmn/split_type comm/split_type: allow MPI_UNDEFINED for split_type	2016-06-16 21:12:25 -06:00
Nathan Hjelm	65be935676	comm/split_type: allow MPI_UNDEFINED for split_type It is valid for any rank to deviate on the split_type argument if they specify MPI_UNDEFINED. The code was incorrectly not allowing this condition. Changed the split_type uniformity check and allow local_size to be 0 if the local split_type is MPI_UNDEFINED. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-06-16 17:42:28 -06:00
rhc54	702a982271	Merge pull request #1767 from rhc54/topic/pmix2 Enable the PMIx event notification capability	2016-06-16 15:27:43 -07:00
Nathan Hjelm	e135543cb0	Merge pull request #1785 from hjelmn/malloc_hook_fix opal/memory: disable __malloc_initialize_hook if poisoned	2016-06-15 14:55:44 -06:00
Nathan Hjelm	7018aeda2b	opal/memory: disable __malloc_initialize_hook if poisoned Newer versions of gcc have "poisoned" the __malloc_initialize_hook name and it can no longer be used. Added a configure check and protection around its usage. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-06-15 12:00:49 -06:00
KAWASHIMA Takahiro	dff6accec6	ompi/datatype: Fix args of DARRAY According to MPI-3.1 P.122, `ni` for `MPI_COMBINER_DARRAY` should be `4ndims+4`, not `4size+4`. This bug may cause SEGV if `size` is smaller than `ndims` when the darray is used for one-sided communication (pt2pt OSC). This bug was introduced in open-mpi/ompi@79b13f36 (when darray became a first class citizen and the `a_i` index of darray was shifted by 2). The corresponding `MPI_Type_create_darray()` function sets a right value so we don't need to update the function.	2016-06-15 11:24:22 +09:00
Ralph Castain	5d330d5220	Enable the PMIx event notification capability and use that for all error notifications, including debugger release. This capability requires use of PMIx 2.0 or above as the features are not available with earlier PMIx releases. When OMPI master is built against an earlier external version, it will fallback to the prior behavior - i.e., debugger will be released via RML and all notifications will go strictly to the default error handler. Add PMIx 2.0 Remove PMIx 1.1.4 Cleanup copying of component Add missing file Touchup a typo in the Makefile.am Update the pmix ext114 component Minor cleanups and resync to master Update to latest PMIx 2.x Update to the PMIx event notification branch latest changes	2016-06-14 13:08:41 -07:00
Jeff Squyres	c2185bb4b8	Merge pull request #1781 from jsquyres/pr/disable-psm-psm2-signal-hijacking PSM/PSM2: Disable signal handler hijacking by default	2016-06-14 15:33:24 -04:00
Jeff Squyres	5071602c59	PSM/PSM2: Disable signal handler hijacking by default Per discussion on https://github.com/open-mpi/ompi/pull/1767 (and some subsequent phone calls and off-issue email discussions), the PSM library is hijacking signal handlers by default. Specifically: unless the environment variable `IPATH_NO_BACKTRACE=1` (for PSM / Intel TrueScale) is set, the library constructor for this library will hijack various signal handlers for the purpose of invoking its own error reporting mechanisms. This may be a bit surprising, but is not a problem, per se. The real problem is that older versions of at least the PSM library do not unregister these signal handlers upon being unloaded from memory. Hence, a segv can actually result in a double segv (i.e., the original segv and then another segv when the now-non-existent signal handler is invoked). This PSM signal hijacking subverts Open MPI's own signal reporting mechanism, which may be a bit surprising for some users (particularly those who do not have Intel TrueScale). As such, we disable it by default so that Open MPI's own error-reporting mechanisms are used. Additionally, there is a typo in the library destructor for the PSM2 library that may cause problems in the unloading of its signal handlers. This problem can be avoided by setting `HFI_NO_BACKTRACE=1` (for PSM2 / Intel OmniPath). This is further compounded by the fact that the PSM / PSM2 libraries can be loaded by the OFI MTL and the usNIC BTL (because they are loaded by libfabric), even when there is no Intel networking hardware present. Having the PSM/PSM2 libraries behave this way when no Intel hardware is present is clearly undesirable (and is likely to be fixed in future releases of the PSM/PSM2 libraries). This commit sets the following two environment variables to disable this behavior from the PSM/PSM2 libraries (if they are not already set): * IPATH_NO_BACKTRACE=1 * HFI_NO_BACKTRACE=1 If the user has set these variables before invoking Open MPI, we will not override their values (i.e., their preferences will be honored). Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-06-14 11:45:23 -07:00
Edgar Gabriel	1ddfd6cdca	io/ompio: fix the preallocate function handle preallocating sizes less than the current file size correctly.	2016-06-14 10:50:32 -05:00
KAWASHIMA Takahiro	84b110a1f2	ompi/datatype: Fix args of HINDEXED_BLOCK According to MPI-3.1 P.121, `ni` for `MPI_COMBINER_HINDEXED_BLOCK` should be `2`, not `2 + count`. This bug was introduced in `113b45b4` (when `MPI_Type_create_hindexed_block` support is added in Open MPI) and fixed partially in `7f5314ee` and `8de93982`. This commit fixes the remaining part. Probably this bug has no user impact. It only consumes a bit more memory.	2016-06-10 17:32:33 +09:00
Gilles Gouaillardet	80e362de52	coll/base: fix memory free in ompi_coll_base_allreduce_intra_recursivedoubling err handler Fix CID 1362630 Fixes open-mpi/ompi@0e393195d9	2016-06-09 13:12:25 +09:00
Gilles Gouaillardet	ead7efef3f	coll/basic: silence CID 1362614 in mca_coll_basic_allreduce_inter()	2016-06-09 09:40:19 +09:00
Gilles Gouaillardet	ad2e1a5ae9	coll/base: silence CID 1362613 in ompi_coll_base_alltoall_intra_basic_linear()	2016-06-09 09:40:05 +09:00
Gilles Gouaillardet	80b267af1c	coll/base: silence CID 1362601 in ompi_coll_base_sendrecv_zero()	2016-06-09 09:37:31 +09:00
Gilles Gouaillardet	0e393195d9	coll/base: fix [all]reduce with non zero lower bound datatypes Offset temporary buffer when a non zero lower bound datatype is used. Thanks Hristo Iliev for the report	2016-06-08 16:48:00 +09:00
Nathan Hjelm	97c1643216	Merge pull request #1766 from hjelmn/req_fix ompi/request: fix loop conditional	2016-06-07 12:11:56 -06:00
Nathan Hjelm	3ddf3ccbf3	Merge pull request #1758 from hjelmn/ob1_fixes pml/ob1: bug fixes	2016-06-07 11:18:55 -06:00

9050 Commits