openmpi

Author	SHA1	Message	Date
Geoff Paulsen	de0c595ca5	Merge pull request #5650 from matcabral/remove_psm2_shadow_env_40x v4.0.x: MTL PSM2: Remove shadow variables from v4.0.x	2018-09-17 14:40:59 -05:00
Gilles Gouaillardet	080e20fa02	mtl/psm2: fix a misc memory leak Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp> (cherry picked from commit open-mpi/ompi@316e4e38f4)	2018-09-10 09:17:54 +09:00
matcabral	8fa172e60b	MTL PSM2: Remove shadow variables from v4.0.x As agreed on #4574, these were removed in past release branches to avoid performance impacts in the default values for some parameters. Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>	2018-09-05 18:44:40 -04:00
Aravind Gopalakrishnan	37d1a202be	MTL OFI: Fix race condition due to global progress entries array Since progress entries array is globally allocated, it is susceptible to race conditions when using multi-threaded applications. Allocating it on the stack resolves any potential races as it is thread local by default. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com> (cherry picked from commit `ed2343034d`)	2018-08-28 14:23:56 -07:00
Howard Pritchard	7b6a2da71a	Merge pull request #5504 from rhc54/cmr40/ofi MTL OFI: send/isend split into blocking/non-blocking paths	2018-08-13 14:18:05 -06:00
Howard Pritchard	9a6f6e61f0	Merge pull request #5499 from nrspruit/ns_cancel_fix_4.0 MTL OFI: Fix Deadlock in fi_cancel given completion during cancel	2018-08-07 09:16:56 -06:00
Spruit, Neil R	1fbbae1907	MTL OFI: send/isend split into blocking/non-blocking paths -Updated blocking send to directly call functionality and set completion events expected to 0 initially. This allows for optimization for providers that support fi_tinject up to larger sizes. This also reduces latency on running the OFI mtl with smaller sizes without requiring calls to progress given fi_tinject is required to complete the messaging before returning and will not create any events in the Completion Queue. -Updated non-blocking send to directly call fi_tsend and avoid calling fi_tinject as the functionality should not wait on completions. This resolves a bug where applications calling MPI_Isend can overrun the TX buffer with small (inject) messages causing a deadlock. In addition this improves performance in message rates by preventing waiting on any size message to complete in non-blocking send messages. -Created common ompi_mtl_ofi_ssend_recv function to post the ssend recv which is common between isend and send code paths. Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com> (cherry picked from commit `7dc8c8ba3f`)	2018-08-01 06:45:48 -07:00
Spruit, Neil R	9cc6bc1ea6	MTL OFI: Fix Deadlock in fi_cancel given completion during cancel - If a message for a recv that is being cancelled gets completed after the call to fi_cancel, then the OFI mtl will enter a deadlock state waiting for ofi_req->super.ompi_req->req_status._cancelled which will never happen since the recv was successfully finished. - To resolve this issue, the OFI mtl now checks ofi_req->req_started to see if the request has been started within the loop waiting for the event to be cancelled. If the request is being completed, then the loop is broken and fi_cancel exits setting ofi_req->super.ompi_req->req_status._cancelled = false; Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com> (cherry picked from commit `767135c580`)	2018-07-30 07:17:40 -07:00
Spruit, Neil R	ac8d2e01f9	MTL OFI: MTL_OFI_RETRY_UNTIL_DONE support for Resource overflow - Added support in MTL_OFI_RETRY_UNTIL_DONE to handle -FI_EAGAIN from the provider and correctly attempt to progress the OFI Completion queue by calling ompi_mtl_ofi_progress. - If events were pending that blocked OFI operations from being enqueued they will be completed and the OFI operation will be retried once ompi_mtl_ofi_progress has successfully completed. - Updated MTL_OFI_RETRY_UNTIL_DONE to take a RETURN variable instead of requiring the existence of a "ret" variable to pass back the return value from completing the OFI operation. Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com> (cherry picked from commit `d4f408a7f8`)	2018-07-23 11:14:42 -07:00
Spruit, Neil R	9a17864278	MTL OFI: Redesign sync send with reduced tag bits and quick ack -Updated the design for sync send MPI calls to use 2 protocol bits for denoting "sync_send" or "sync_send_ack". -"Sync_send" is added to the send tag only and is masked out in receives such that it can be read by the original Recv posted in the send/recv operation. -"Sync_send_ack" is sent from the recv callback to the send side. This 0 byte send does not generate a completion entry and instead sends the message and immediately completes the opal completion in the recv. -Tag formats ofi_tag_1 and ofi_tag_2 have been updated to include 2 more tag bits per format type due to the reduced protocol bits required by OMPI. Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>	2018-07-09 06:50:21 -07:00
Matias A Cabral	e6674556aa	MTL OFI: add support for FI_REMOTE_CQ_DATA. Extend number of supported ranks with providers that support FI_REMOTE_CQ_DATA. Add README file to OFI MTL Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>	2018-06-14 17:17:38 -07:00
Brian Barrett	09e4c40ce9	mtl: remove MXM MTL Remove the MXM MTL, which has been deprecated in preference for the Yalla PML. This was discussed at the last developers meeting and somehow I ended up with the action item to do the removal. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2018-05-21 14:18:30 -07:00
Nathan Hjelm	f432d07844	mtl: reset ompi_mtl_base_selected_component on framework close Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2018-05-02 14:53:34 -06:00
Todd Kordenbrock	d646a00cd9	Merge pull request #5054 from tkordenbrock/topic/master/mtl-portals4.finalize.fix master: mtl-portals4: don't call progress() in finalize() if Portals4 was not initialized	2018-04-12 12:12:05 -05:00
Todd Kordenbrock	90659671bc	mtl-portals4: don't call progress() in finalize() if Portals4 was not initialized This commit fixes a segfault in mtl-portals4 finalize(). The segfault occurs if finalize() is called without any calls to add_procs(). This commit resolves the segfault by skipping the progress() loop in finalize() if the Portals was not initialized. Signed-off-by: Todd Kordenbrock (thkgcode@gmail.com)	2018-04-10 14:22:32 -05:00
Spruit, Neil R	e7bff501cd	MTL OFI: Added support for reading multiple CQ events in ofi progress -Updated ompi_mtl_ofi_progress to use an array to read CQ events up to a threshold that can be set by the Open MPI User. -Users can adjust the number of events that can be handled in the ompi_mtl_ofi_progress by setting "--mca mtl_ofi_progress_event_cnt #". -The default value for the number of CQ events that can be read in a single call to ofi progress is 100 which is an average based off workload use case analysis showing 70-128 as the range of multiple events returned during ofi progress. Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>	2018-02-15 09:41:14 -05:00
Aravind Gopalakrishnan	fb68726baf	MTL OFI: Allow retries in MTL progress for interrupted syscalls This fixes a regression in sockets provider which could return -EINTR value from fi_cq_read() due to a syscall being interrupted. The error value is currently interpreted as fatal condition. Relax the rule so that we can retry fi_cq_read() operation. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>	2017-12-20 14:58:49 -08:00
Matias Cabral	2c86b8723d	Merge pull request #4510 from matcabral/mtl_psm2_shadow_vars New flag for MCA parameters that allows a behaving with a default value of "unset".	2017-12-04 12:25:37 -08:00
Howard Pritchard	b160cf6339	Merge pull request #4533 from hppritcha/topic/ofi_mtl_mprobe_fixes mtl/ofi: fix problem with mprobe/mrecv	2017-12-04 09:11:47 -07:00
Nathan Hjelm	1282e98a01	opal/asm: rename existing arithmetic atomic functions This commit renames the arithmetic atomic operations in opal to indicate that they return the new value not the old value. This naming differentiates these routines from new functions that return the old value. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2017-11-30 10:41:22 -07:00
Nathan Hjelm	9d0b3fe9f4	opal/asm: remove opal_atomic_bool_cmpset functions This commit eliminates the old opal_atomic_bool_cmpset functions. They have been replaced by the opal_atomic_compare_exchange_strong functions. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2017-11-30 10:41:22 -07:00
Howard Pritchard	cd48eccbae	mtl/ofi: fix problem with mprobe/mrecv At least with some providers (sockets and GNI), the mprobe/mrecv ofi mtl methods were incorrect. For these two providers at least one must supply the original tag and mask bits used with the prior FI_PEEK | FI_CLAIM request that had been used to probe for the message. These providers take a strict interpretation of the following sentence from the libfabric fi_tagged man page: ``` Claimed messages can only be retrieved using a subsequent, paired receive operation with the FI_CLAIM flag set. ``` Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2017-11-24 08:11:18 -07:00
Matias A Cabral	1fad59465f	New flag for MCA parameters that allows a behaving with a default value of "unset". mtl/psm2: Update some shadow mca parameters to use the default "unset". mtl/psm2: Add new shadow parameter to allow specifying the service level. Signed-off-by: Matias A Cabral <matias.a.cabral@intel.com>	2017-11-16 16:28:50 -08:00
Matias Cabral	d1869a725a	Merge pull request #4467 from matcabral/master mtl/ofi: Set data and control progress options default values to FI_PROGRESS_UNSPEC	2017-11-13 07:35:39 -08:00
Jeff Squyres	a8686a6813	mtl ofi: squelch compiler warnings gcc 5.2 complains: ``` mtl_ofi_component.c: In function ‘ompi_mtl_ofi_finalize’: mtl_ofi_component.c:613:5: warning: suggest parentheses around assignment used as truth value [-Wparentheses] if (ret = fi_close((fid_t)ompi_mtl_ofi.fabric)) { ^ ``` Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-11-11 05:07:11 -08:00
Jeff Squyres	5a6ddf42d6	mtl ofi: it is not an error to return no data from fi_getinfo() Before this commit, the presence of usNIC devices -- which will (currently) return no data when fi_getinfo() is queried for tagged matching providers -- would cause an error message to be displayed. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-11-11 05:07:11 -08:00
Jeff Squyres	f910f554f7	mtl ofi: show the positive value of the error The value of ret is negative (e.g., -61), but it is displayed in the help message as `%zd`, which renders as unsigned (i.e., a giant positive value). So make sure to negate the negative value before rendering it (e.g., so we display "61", not "4294967235"). Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-11-11 05:07:11 -08:00
Jeff Squyres	e8c13ef286	mtl ofi: fix trivial comment whitespace No code or logic changes. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-11-11 05:07:10 -08:00
Jeff Squyres	bed1930df8	mtl ofi: fix formatting of help message No code or logic changes. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-11-11 05:07:05 -08:00
Matias Cabral	b76bb42ac1	mtl/ofi: Set data and control progress options default values to FI_PROGRESS_UNSPEC so each provider will use its default. Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>	2017-11-08 08:24:33 -08:00
bosilca	63e8a8c608	Merge pull request #4431 from hjelmn/asm_cleanup opal: rename opal_atomic_cmpset* to opal_atomic_bool_cmpset*	2017-11-02 18:45:56 -04:00
Nathan Hjelm	3ff34af355	opal: rename opal_atomic_cmpset* to opal_atomic_bool_cmpset* This commit renames the atomic compare-and-swap functions to indicate the return value. This is in preparation for adding support for a compare-and-swap that returns the old value. At the same time the return type has been changed to bool. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2017-10-31 12:47:23 -06:00
Aravind Gopalakrishnan	285fc42b4e	Fix OFI MTL to recognize correct CQ empty scenario Currently, the progress function is incorrectly interpreting any error value other than a positive value or -FI_EAVAIL to mean CQ is empty. CQ is empty only if fi_cq_read() call returned -EAGAIN error code. Fix that here. While at it, fix help text output for calls made to OFI API. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>	2017-10-30 12:13:44 -07:00
Aravind Gopalakrishnan	bea4503f95	Move help text output regarding PSM2_CUDA envvar to component init phase The messages should be printed only in the event of CUDA builds and in the presence of supporting hardware and when PSM2 MTL has actually been selected for use. To this end, move help text output to component init phase. Also use opal_setenv/unsetenv() for safer setting, unsetting of the environment variable and sanitize the help text message. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>	2017-10-26 16:01:01 -07:00
Matias Cabral	b81bcd4b0d	MTL PSM2: add a thread lock while peeking and completing the psm2 requests. Reviewed-by: Gopalakrishnan, Aravind <aravind.gopalakrishnan@intel.com> Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>	2017-10-20 14:46:48 -07:00
Aravind Gopalakrishnan	f8a2b7f6bf	Use opal_show_help to warn about PSM2_CUDA envvar setting If Open MPI is configured with CUDA, then user also should be using a CUDA build of PSM2 and therefore be setting PSM2_CUDA environment variable to 1 while using CUDA buffers for transfers. If we detect this setting to be missing, force set it. If user wants to use this build for regular (Host buffer) transfers, we allow the option of setting PSM2_CUDA=0, but print a warning message to user that it is not a recommended usage scenario. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>	2017-09-29 17:04:10 -07:00
yohann	1f8cabc890	mtl/ofi: Fix provider selection. This allows mtl_ofi_provider_include to work with layered providers as well. e.g. --mca mtl_ofi_provider_include "providerX;ofi_rxm" Signed-off-by: yohann <yohann.burette@intel.com>	2017-09-20 16:00:50 -07:00
Aravind Gopalakrishnan	2e83cf15ce	Add support for GPU buffers for PSM2 MTL PSM2 enables support for GPU buffers and CUDA managed memory and it can directly recognize GPU buffers, handle copies between HFIs and GPUs. Therefore, it is not required for OMPI to handle GPU buffers for pt2pt cases. In this patch, we allow the PSM2 MTL to specify when it does not require CUDA convertor support. This allows us to skip CUDA convertor init phases and lets PSM2 handle the memory transfers. This translates to improvements in latency. The patch enables blocking collectives and workloads with GPU contiguous, GPU non-contiguous memory. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>	2017-09-01 16:59:03 -07:00
Joshua Hursey	e1d079544b	mca: Dynamic components link against project lib * Resolves #3705 * Components should link against the project level library to better support `dlopen` with `RTLD_LOCAL`. * Extend the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am` with the appropriate project level library: ``` MCA components in ompi/ $(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la MCA components in orte/ $(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la MCA components in opal/ $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la MCA components in oshmem/ $(top_builddir)/oshmem/liboshmem.la" ``` Note: The changes in this commit were automated by the script in the commit that proceeds it with the `libadd_mca_comp_update.py` script. Some components were not included in this change because they are statically built only. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2017-08-24 11:56:16 -04:00
Howard Pritchard	701a1d0218	mtl/psm2: add pvar support for PSM2 MQ stats Add pvars for PSM2 MQ stats to help in analyzing performance of Omnipath. Tested (modestly) using modified OSU pt2pt benchmarks. Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2017-07-14 10:31:35 -06:00
Ryan Grant	0ce8590e7c	Merge pull request #3837 from tkordenbrock/topic/master/get.retry.timeout master: mtl-portals4: add timeout to rendezvous get fragments	2017-07-13 09:59:54 -06:00
Nathan Hjelm	6fb81f20e4	mtl/psm2: create mca variables to shadow PSM2 environment variables This commit enables MCA support for the following PSM2 environment variables: PSM2_DEVICES, PSM2_MEMORY, PSM2_MQ_SENDREQS_MAX, PSM2_MQ_RECVREQS_MAX, PSM2_MQ_RNDV_HFI_THRESH, PSM2_MQ_RNDV_SHM_THRESH, PSM2_RCVTHREAD, PSM2_SHAREDCONTEXTS, PSM2_SHAREDCONTEXTS_MAX, and PSM2_TRACEMASK. These variables can be set by MCA if they are not already set in the environment. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2017-07-13 09:48:46 -06:00
Todd Kordenbrock	5ecd905358	mtl/portals4: move opal_timer_base_get_usec() out of the fast path Rearrange the receive frag timeout logic to avoid calling opal_timer_base_get_usec() in read_msg(). Instead set it at the first retry. Signed-off-by: Todd Kordenbrock <thkgcode@gmail.com>	2017-07-09 22:12:45 -05:00
Todd Kordenbrock	37766d770d	mtl/portals4: if frag retry fails, then fail the entire receive If a frag cannot be retried because the ni_fail_type is other than PTL_NI_DROPPED, then set the return type and jump to callback_error. This sets MPI_ERROR and completes the receive. Signed-off-by: Todd Kordenbrock <thkgcode@gmail.com>	2017-07-09 22:12:31 -05:00
Piotr Lesnicki	99453e6b10	mtl/portals4: get retransmission REPLY code Signed-off-by: Todd Kordenbrock <thkgcode@gmail.com>	2017-07-09 22:12:25 -05:00
Piotr Lesnicki	06b15cebbf	mtl/portals4: add timeout to get retransmit Signed-off-by: Todd Kordenbrock <thkgcode@gmail.com>	2017-07-09 22:12:08 -05:00
Todd Kordenbrock	27ee862964	mtl-portals4: in rendezvous, reissue PtlGet() if it fails This commit fixes a race condition in the rendezvous protocol. The race occurs because the sender does not wait for the link event on the send buffer. Even though this has not been seen in the wild, it is possible for the receiver to issue the PtlGet() before the ME is linked which causes a NAK at the receiver. This commit resolves this race by reissuing the PtlGet() when a NAK occurs. Signed-off-by: Todd Kordenbrock <thkgcode@gmail.com>	2017-05-15 13:11:13 -05:00
Matias A Cabral	644641d06f	PSM and PSM2 MTLs check on the max message size allowed by API. OMPI send and receive messages use size_t for the length while PSM and PSM2 psm(2)mq_send/receive use uint32_t. Type size_t is 64 bits on 64-bit architectures. Therefore, this patch adds a sanity check on the length of the message and fails gracefully. Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>	2017-05-10 12:45:11 -07:00
Howard Pritchard	841192645b	common/libfabric: move libfabric to ofi This PR renames the common library for OFI libfabric from libfabric to ofi. There are a number of reasons this is good to do: 1) it's shorter and replaces 9 characters with three for function names for what may eventually be a fairly extensive interface 2) OFI is the term used for MTL and RML components that use the OFI libfabric interface 3) A planned OSC component will also use the OFI term. 4) Other HPC libraries that can use OFI libfabric tend to use the term "ofi" internally and also in their configure options relevant to OFI libfabric (i.e. MPICH/CH4, Intel MPI, Sandia SHMEM) There seem to be comments in places in the Open MPI source code that indicate that this common library will be going away. Far from it as we will want to be able to share things like AV objects between OMPI and possibly OSHMEM components that use the OFI libfabric interface. This PR also adds a synonym to the --with-libfabric(-libdir) configury options: --with-ofi and with-ofi-libdir. Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2017-04-20 13:07:16 -06:00
Yossi Itigin	33471c44ee	pml_yalla/mtl_mxm/hcoll: open memory component to activate memory hooks. Memory hooks are now set-up on demand. pml/yalla, mtl/mxm and coll/hcoll need the memory hooks, so make sure those are installed. Signed-off-by: Yossi Itigin <yosefe@mellanox.com>	2017-03-01 12:12:20 +02:00


532 commits