openmpi

Author	SHA1	Message	Date
Sergey Oblomov	b0f87f2235	PML/UCX: blocked calls optimizations - added UCX progress priority Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-08-27 09:50:38 +03:00
Sergey Oblomov	b72dd83f05	MCA/COMMON/UCX: added synonyms for common ucx variables - added synonyms for atomic/osc modules Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-08-26 18:25:21 +03:00
Jeff Squyres	fe0852bcb4	Miscellaneous compiler warning stomps. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-08-24 07:39:14 -07:00
Nathan Hjelm	feb0e90301	Merge pull request #5589 from hjelmn/threads_cleanup config: remove OPAL_ENABLE_MULTI_THREADS config macro	2018-08-23 15:43:13 -06:00
Nathan Hjelm	d0cd80e902	osc/rdma: clean out stale aggregation code The aggregation code in osc/rdma is currently broken and will likely not be reused. This commit cleans it out. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-08-23 15:40:21 -06:00
Aravind Gopalakrishnan	5cbcae79d8	MTL OFI: Ask for FI_THREAD_DOMAIN support when not using MPI_THREAD_MULTIPLE When an application is not using multiple threads to call into MPI, we can safely ask for FI_THREAD_DOMAIN setting from the provider as it should translate to the least amount of locking in provider. Conversely, for applications using THREAD_MULTIPLE, explicitly ask for FI_THREAD_SAFE to prevent race conditions. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>	2018-08-23 14:18:32 -07:00
Nathan Hjelm	1c84f48640	config: remove OPAL_ENABLE_MULTI_THREADS config macro We long ago hard-coded this value to 1. This commit cleans it out entirely. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-08-23 13:47:02 -06:00
Ralph Castain	f7655280cb	Merge pull request #5503 from aravindksg/aravindksg/fix_ofi_race MTL OFI: Fix race condition due to global progress entries array	2018-08-22 14:31:38 -07:00
Nathan Hjelm	29320872b3	osc/rdma: quiet warning gcc complains about ret possibly being used uninitialized. That will never happen but we should still quiet the warning. This commit sets ret to a valid value. Fixes #5513 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-08-21 15:54:53 -06:00
Nathan Hjelm	438c40de03	osc/pt2pt: use c99 for module initialization Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-08-21 11:23:33 -06:00
Sergey Oblomov	e00f7a68ba	MCA/COMMON/UCX: added synonym to opal_mem_hook variable - added synonym to opal_mem_hook variable to allow printing it in opal_info -a Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-08-21 15:05:12 +03:00
Edgar Gabriel	e6a344ba63	Merge pull request #5561 from edgargabriel/pr/file_open_sharedfp_ordering common/ompio: fix an ordering problem during file_open	2018-08-20 10:18:14 -05:00
Edgar Gabriel	eaabfdd028	Merge pull request #5539 from DDNStorage/ime-support ompio: support for DDN's Infinite Memory Engine	2018-08-20 09:52:22 -05:00
Edgar Gabriel	2742273ee3	common/ompio: fix an ordering problem during file_open The sharedfp component has to be selected and opened before we set the default file view during file_open. Otherwise there is a spurious error message from the sharefp_file_seek operation that is called during the file_set_view. Fixes Issue #5560 Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>	2018-08-20 09:28:29 -05:00
Jeff Squyres	8a0b5454ae	fortran/use TKR: remove excess declaration for PMPI_Type_extent This declaration was accidentally left behind in 89da9651bb2fe. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-08-16 10:31:41 -07:00
Gaëtan Bossu	ccc96efc2e	DDN's Infinite Memory Engine support for OMPIO Changes made: - Create a new fs component for IME - Create a new fbtl component for IME - Modify the close function of OMPIO to finalize IME if necessary Signed-off-by: Gaëtan Bossu <gbossu@ddn.com> Signed-off-by: Sylvain Didelot <sdidelot@ddn.com>	2018-08-16 11:45:47 +02:00
Aravind Gopalakrishnan	ed2343034d	MTL OFI: Fix race condition due to global progress entries array Since progress entries array is globally allocated, it is susceptible to race conditions when using multi-threaded applications. Allocating it on the stack resolves any potential races as it is thread local by default. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>	2018-08-09 10:52:28 -07:00
Jeff Squyres	89773c41a2	Fix script abstraction break: mv make_manpage.pl to config Having the "make_manpage.pl" script in the ompi/ tree broke "./autogen.pl --no-ompi" (specifically: "make distcheck" of --no-ompi builds would break). Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2018-08-08 08:50:55 -07:00
Todd Kordenbrock	e9f378e851	Merge pull request #5500 from tkordenbrock/topic/master/fix.PtlMEUnlink.in.use coll-portals4: retry PtlMEUnlink() if PTL_IN_USE	2018-08-07 11:21:00 -05:00
Nathan Hjelm	c294bbc352	Merge pull request #5508 from hjelmn/fuzzy_match Bring fuzzy matching support into master	2018-08-06 13:52:04 -06:00
Nathan Hjelm	eeae3f9b93	Merge pull request #5517 from bosilca/topic/treematch_warnings Remove few warnings identified by @rhc in #5514.	2018-08-06 13:25:07 -06:00
Matthew Dosanjh	c8d13486cc	Fixed promotion bug Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-08-06 12:56:36 -06:00
Boris Karasev	57683366ca	pmix: added check for pmix fence status Signed-off-by: Boris Karasev <karasev.b@gmail.com>	2018-08-06 15:01:57 +06:00
George Bosilca	6d11a45f44	Remove few warnings identified by @rhc in #5514 . Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2018-08-03 16:21:06 -04:00
George Bosilca	a5fbfa476a	Be conservative with the array_of_indices We were assuming that the array_of_indices has the same size as the number of requests (incount), instead of the number of actually active requests. While the patch is trivial, the question of the size of the array_of_indices should be clarified in the MPI Forum. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2018-08-03 14:58:13 -04:00
Nathan Hjelm	dd74c6252f	pml/ob1: custom matching cleanup and configury This commit updates the new custom matching code in pml/ob1 so it can not be enabled with a configure option. This commit also renames the fuzzy-matching headers to avoid potential name conflicts and removes the use of C reserved identifiers. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-08-02 13:06:19 -06:00
Matthew Dosanjh	572694b621	Adding custom match source. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-08-02 12:23:08 -06:00
Ralph Castain	1aef0a64aa	Merge pull request #5477 from nrspruit/ns_mtl_send_isend MTL OFI: send/isend split into blocking/non-blocking paths	2018-07-31 13:08:37 -07:00
Ralph Castain	8744320a18	Merge pull request #5476 from nrspruit/ns_cancel_fix MTL OFI: Fix Deadlock in fi_cancel given completion during cancel	2018-07-31 13:07:41 -07:00
Todd Kordenbrock	f3f2a826b4	coll-portals4: retry PtlMEUnlink() if PTL_IN_USE In the cleanup phase, it is possible for PtlMEUnlink() to return PTL_IN_USE if the NIC is not done with the ME. This should not be considered an error. This commit adds a retry loop around PtlMEUnlink(). In some cases, the return value of PtlMEUnlink() and PtlCTFree() was not checked at all. Check them with the same retry loop as above. Signed-off-by: Todd Kordenbrock <thkgcode@gmail.com>	2018-07-31 10:20:55 -05:00
Mark Allen	f413ef6b14	apply romio314 patch to romio321 When romio314 was first pulled in an extra patch was applied to it, see commit 92f6c7c1e210c559471a05aaac9b19e0bd3d71bb. Most of that patch is already present in vanilla romio321, but the fix for MPIO_DATATYPE_ISCOMMITTED() isn't. If that macro doesn't set err_ then some paths end up with a variable being used uninitialized. In particular you can trace through romio321/romio/mpi-io/read.c to see what happens with error_code. It's an uninitialized stack variable that goes through three MPIO_CHECK_* macros none of which set it. The macros consistently set error_code to a failure if they see something wrong, but they don't consistently set it to success when things are fine. And then in the last macro MPIO_CHECK_DATATYPE it tries to look at the value of error_code that was never set. Signed-off-by: Mark Allen <markalle@us.ibm.com>	2018-07-30 17:14:56 -04:00
Sergey Oblomov	d204b8a678	PML/SPML/UCX/COMPONENT: applied C99 initialization Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-07-28 09:44:03 +03:00
Mikhail Kurnosov	b45e190e66	coll/base/allgatherv: fix MPI_IN_PLACE processing The call of MPI_Allgatherv with sendbuf and sendtype parameters equal to MPI_IN_PLACE and NULL correspondingly, produces the segmentation fault. The problem is that sendtype is used even when sendbuf value is MPI_IN_PLACE. But according to the standard, sendtype and sendcount parameters should be ignored in this case. Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>	2018-07-27 09:34:17 +07:00
Sergey Oblomov	2806504290	PML/SPML/UCX: init global objects using C99 style - to avoid value mix used C99 style of object initializations Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-07-25 14:52:45 +03:00
Spruit, Neil R	7dc8c8ba3f	MTL OFI: send/isend split into blocking/non-blocking paths -Updated blocking send to directly call functionality and set completion events expected to 0 initially. This allows for optimization for providers that support fi_tinject up to larger sizes. This also reduces latency on running the OFI mtl with smaller sizes without requiring calls to progress given fi_tinject is required to complete the messaging before returning and will not create any events in the Completion Queue. -Updated non-blocking send to directly call fi_tsend and avoid calling fi_tinject as the functionality should not wait on completions. This resolves a bug where applications calling MPI_Isend can overrun the TX buffer with small (inject) messages causing a deadlock. In addition this improves performance in message rates by preventing waiting on any size message to complete in non-blocking send messages. -Created common ompi_mtl_ofi_ssend_recv function to post the ssend recv which is common between isend and send code paths. Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>	2018-07-24 07:54:24 -07:00
Spruit, Neil R	767135c580	MTL OFI: Fix Deadlock in fi_cancel given completion during cancel - If a message for a recv that is being cancelled gets completed after the call to fi_cancel, then the OFI mtl will enter a deadlock state waiting for ofi_req->super.ompi_req->req_status._cancelled which will never happen since the recv was successfully finished. - To resolve this issue, the OFI mtl now checks ofi_req->req_started to see if the request has been started within the loop waiting for the event to be cancelled. If the request is being completed, then the loop is broken and fi_cancel exits setting ofi_req->super.ompi_req->req_status._cancelled = false; Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>	2018-07-24 03:12:44 -07:00
Matias Cabral	d996f529c0	MTL OFI: Add support for mem_tag_format OFI providers may reserve some of the upper bits of the tag for internal usage and expose it using mem_tag_format. Check for that and adjust communicator bits as needed. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>	2018-07-23 11:39:40 -07:00
Matias Cabral	30fb635836	Merge pull request #5446 from nrspruit/ns_mtl_ofi_overflow MTL OFI: MTL_OFI_RETRY_UNTIL_DONE support for Resource overflow	2018-07-20 14:53:53 -07:00
Sergey Oblomov	6fe0a73861	PML/UCX: fixed ucp request free on persistent request completion - in some cases the persistent request was deleted during the completion callback, which caused a double free of the linked UCX request (assert in debug build or hang in release build) - UCX request is freed prior to the completion callback Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-07-20 19:32:20 +03:00
Yossi Itigin	bdb6ece3dd	Merge pull request #5452 from hoopoepg/topic/osc-ucx-fox-hang OSC/UCX: fixed hang on OSC init	2018-07-19 13:57:51 +03:00
Sergey Oblomov	fa33e322e7	OSC/UCX: code deduplication Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-07-19 12:39:15 +03:00
Sergey Oblomov	6f0a7a2005	OSC/UCX: opal progress register/unregister optimization Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-07-19 12:07:26 +03:00
Yossi Itigin	29812494f2	Merge pull request #5402 from hoopoepg/topic/common-del-procs MCA/COMMON/UCX: del_procs calls are unified to common module	2018-07-19 11:19:45 +03:00
Sergey Oblomov	55b934bacf	OSC/UCX: enable progress when at least one window is allocated Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-07-18 17:52:30 +03:00
Sergey Oblomov	a081fba046	OSC/UCX: fixed hang on OSC init - worker progress was missed on startup, which caused a hang on one of the ranks Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-07-18 17:01:53 +03:00
Edgar Gabriel	b6b9552ca9	Merge pull request #5444 from gbossu/fix-file-delete io/ompio: Call component-specific file_delete function instead of POSIX unlink	2018-07-18 08:45:57 -05:00
Sergey Oblomov	920cc2e0d9	MCA/COMMON/UCX: del_procs calls are unified to common module Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>	2018-07-18 07:37:25 +03:00
Mikhail Kurnosov	540c2d1617	coll-base-allgather: fix MPI_IN_PLACE processing The call of MPI_Allgather with sendbuf and sendtype parameters equal to MPI_IN_PLACE and NULL correspondingly, produces the segmentation fault. The problem is that sendtype is used even when sendbuf value is MPI_IN_PLACE. But according to the standard, sendtype and sendcount parameters should be ignored in this case. Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>	2018-07-18 10:27:00 +07:00
Gilles Gouaillardet	fed1e7766e	Merge pull request #5430 from ggouaillardet/pr/pcollreq-fort mpiext/pcollreq: add Fortran bindings	2018-07-18 09:52:59 +09:00
Joshua Ladd	3add13c72e	Merge pull request #5441 from hoopoepg/topic/ucx-memhooks-to-common-module MCA/COMMON/UCX: shift opal memhooks into common UCX	2018-07-17 15:52:44 -04:00


10244 commits