openmpi

Author	SHA1	Message	Date
Brian Barrett	e9e4d2a4bc	Handle asprintf errors with opal_asprintf wrapper The Open MPI code base assumed that asprintf always behaved like the FreeBSD variant, where ptr is set to NULL on error. However, the C standard (and Linux) only guarantee that the return code will be -1 on error and leave ptr undefined. Rather than fix all the usage in the code, we use opal_asprintf() wrapper instead, which guarantees the BSD-like behavior of ptr always being set to NULL. In addition to being correct, this will fix many, many warnings in the Open MPI code base. Signed-off-by: Brian Barrett <bbarrett@amazon.com>	2018-10-08 16:43:53 -07:00
Gilles Gouaillardet	316e4e38f4	mtl/psm2: fix a misc memory leak Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2018-08-30 10:07:17 +09:00
Matias A Cabral	1fad59465f	New flag for MCA parameters that allows a default value of "unset". mtl/psm2: Update some shadow mca parameters to use the default "unset". mtl/psm2: Add new shadow parameter to allow specifying the service level. Signed-off-by: Matias A Cabral <matias.a.cabral@intel.com>	2017-11-16 16:28:50 -08:00
Aravind Gopalakrishnan	bea4503f95	Move help text output regarding PSM2_CUDA envvar to component init phase The messages should be printed only in the event of CUDA builds and in the presence of supporting hardware and when PSM2 MTL has actually been selected for use. To this end, move help text output to component init phase. Also use opal_setenv/unsetenv() for safer setting, unsetting of the environment variable and sanitize the help text message. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>	2017-10-26 16:01:01 -07:00
Matias Cabral	b81bcd4b0d	MTL PSM2: add a thread lock while peeking and completing the psm2 requests. Reviewed-by: Gopalakrishnan, Aravind <aravind.gopalakrishnan@intel.com> Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>	2017-10-20 14:46:48 -07:00
Aravind Gopalakrishnan	f8a2b7f6bf	Use opal_show_help to warn about PSM2_CUDA envvar setting If Open MPI is configured with CUDA, then user also should be using a CUDA build of PSM2 and therefore be setting PSM2_CUDA environment variable to 1 while using CUDA buffers for transfers. If we detect this setting to be missing, force set it. If user wants to use this build for regular (Host buffer) transfers, we allow the option of setting PSM2_CUDA=0, but print a warning message to user that it is not a recommended usage scenario. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>	2017-09-29 17:04:10 -07:00
Aravind Gopalakrishnan	2e83cf15ce	Add support for GPU buffers for PSM2 MTL PSM2 enables support for GPU buffers and CUDA managed memory and it can directly recognize GPU buffers, handle copies between HFIs and GPUs. Therefore, it is not required for OMPI to handle GPU buffers for pt2pt cases. In this patch, we allow the PSM2 MTL to specify when it does not require CUDA convertor support. This allows us to skip CUDA convertor init phases and lets PSM2 handle the memory transfers. This translates to improvements in latency. The patch enables blocking collectives and workloads with GPU contiguous, GPU non-contiguous memory. Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>	2017-09-01 16:59:03 -07:00
Joshua Hursey	e1d079544b	mca: Dynamic components link against project lib * Resolves #3705 * Components should link against the project level library to better support `dlopen` with `RTLD_LOCAL`. * Extend the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am` with the appropriate project level library: ``` MCA components in ompi/ $(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la MCA components in orte/ $(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la MCA components in opal/ $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la MCA components in oshmem/ $(top_builddir)/oshmem/liboshmem.la ``` Note: The changes in this commit were automated by the `libadd_mca_comp_update.py` script from the commit that precedes it. Some components were not included in this change because they are statically built only. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2017-08-24 11:56:16 -04:00
Howard Pritchard	701a1d0218	mtl/psm2: add pvar support for PSM2 MQ stats Add pvars for PSM2 MQ stats to help in analyzing performance of Omnipath. Tested (modestly) using modified OSU pt2pt benchmarks. Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2017-07-14 10:31:35 -06:00
Nathan Hjelm	6fb81f20e4	mtl/psm2: create mca variables to shadow PSM2 environment variables This commit enables MCA support for the following PSM2 environment variables: PSM2_DEVICES, PSM2_MEMORY, PSM2_MQ_SENDREQS_MAX, PSM2_MQ_RECVREQS_MAX, PSM2_MQ_RNDV_HFI_THRESH, PSM2_MQ_RNDV_SHM_THRESH, PSM2_RCVTHREAD, PSM2_SHAREDCONTEXTS, PSM2_SHAREDCONTEXTS_MAX, and PSM2_TRACEMASK. These variables can be set by MCA if they are not already set in the environment. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2017-07-13 09:48:46 -06:00
Matias A Cabral	644641d06f	PSM and PSM2 MTLs check on the max message size allowed by API. OMPI send and receive messages use size_t for the length, while PSM and PSM2 psm(2)mq_send/receive use uint32_t. Type size_t is 64 bits wide on 64-bit architectures. Therefore, this patch adds a sanity check on the length of the message and fails gracefully. Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>	2017-05-10 12:45:11 -07:00
Gilles Gouaillardet	1daa80d78f	mtl/psm2: plug a memory leak in ompi_mtl_psm2_component_open() Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 09:28:32 +09:00
Ralph Castain	1e2019ce2a	Revert "Update to sync with OMPI master and cleanup to build" This reverts commit cb55c88a8b7817d5891ff06a447ea190b0e77479.	2016-11-22 15:03:20 -08:00
Ralph Castain	cb55c88a8b	Update to sync with OMPI master and cleanup to build Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2016-11-22 14:24:54 -08:00
Gilles Gouaillardet	eae9d31784	pre_condition_transports: code cleanup replace hard coded "OMPI_MCA_orte_precondition_transports" environment variable name with macro'ed OPAL_MCA_PREFIX"orte_precondition_transports"	2016-09-19 13:31:47 +09:00
Ralph Castain	ee56d9dc1a	Shorten the session directory name as some OS's are now providing unusually long temp directory names, causing us to overflow the sockaddr field	2016-07-05 14:59:50 -07:00
Matias A Cabral	29ab28f4f6	Adding owner.txt file for PSM2 MTL.	2016-06-02 16:26:16 -07:00
Matias A Cabral	d28ee62a96	Update in PSM and PSM2 MTLs to detect entries created by drivers for Intel TrueScale and Intel OmniPath, and detect a link in ACTIVE state. This fix addresses the scenario reported in the below OMPI users email, including formerly named Qlogic IB, now Intel True scale. Given the nature of the PSM/PSM2 mtls this fix applies to OmniPath: https://www.open-mpi.org/community/lists/users/2016/04/29018.php	2016-05-09 12:08:44 -07:00
matcabral	9a1f9be146	A new internal feature in PSM2 will use hash tables to accelerate message queue lookups if the lookups have the proper tag&mask layout. OpenMPI should follow PSM2's preferred tag&mask spec, so that PSM2 can provide a performance benefit.	2015-12-14 10:13:39 -08:00
Matias A Cabral	ed16d8e1cc	Updated psm2 mtl with new externally exposed symbols of psm2.so Fixes open-mpi/ompi#1018 Fixes open-mpi/ompi#1021	2015-10-28 09:12:33 -07:00
Nathan Hjelm	53f6b57c0a	pml/cm: use the priority of the mtl component This commit changes the priority of mtl components to be relative to pml/ob1 and updates the mtl interface to expose this priority. cm now sets its own priority based on the priority of the selected mtl component. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2015-10-19 12:32:42 -06:00
Nathan Hjelm	987e865c99	mtl/psm2: add support for dynamic add_procs Add an accessor for the proc_endpoints[OMPI_PROC_ENDPOINT_TAG_MTL] member of the ompi_proc_t structure. This accessor calls add_procs with the ompi_proc_t if the member is NULL. Signed-off-by: Nathan Hjelm <hjelmn@me.com>	2015-09-10 08:55:55 -06:00
Ralph Castain	cf6137b530	Integrate PMIx 1.0 with OMPI. Bring Slurm PMI-1 component online Bring the s2 component online Little cleanup - let the various PMIx modules set the process name during init, and then just raise it up to the ORTE level. Required as the different PMI environments all pass the jobid in different ways. Bring the OMPI pubsub/pmi component online Get comm_spawn working again Ensure we always provide a cpuset, even if it is NULL pmix/cray: adjust cray pmix component for pmix Make changes so cray pmix can work within the integrated ompi/pmix framework. Bring singletons back online. Implement the comm_spawn operation using pmix - not tested yet Cleanup comm_spawn - procs now starting, error in connect_accept Complete integration	2015-08-29 16:04:10 -07:00
Andrew Friedley	2c9be59b37	Add new PSM2 MTL. This new MTL runs over PSM2 for Omni Path. PSM2 is a descendant of PSM with changes to support more ranks and some MPI-3 features like mprobe. PSM2 will only support Omni Path networks; PSM only supports True Scale. Likewise, the existing PSM MTL will continue to be maintained for True Scale, while the PSM2 MTL is developed and maintained for Omni Path.	2015-06-22 07:55:46 -07:00

24 commits
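
Commit e9e4d2a4bc relies on a wrapper that always leaves the output pointer set to NULL on failure. The following is a minimal sketch of that idea in portable C99. It is not the actual `opal_asprintf()` implementation: the name `sketch_asprintf` is hypothetical, and building the wrapper on `vsnprintf` (not on the platform `asprintf`) is an assumption made here to keep the example self-contained.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not the real opal_asprintf): formats into a newly
 * allocated buffer. On any failure it returns -1 and *out is NULL, so
 * callers may test either the return value or the pointer. */
static int sketch_asprintf(char **out, const char *fmt, ...)
{
    va_list ap, ap2;
    int len;

    *out = NULL;                          /* defined value on every error path */

    va_start(ap, fmt);
    va_copy(ap2, ap);                     /* the arguments are consumed twice */
    len = vsnprintf(NULL, 0, fmt, ap);    /* first pass: measure the output */
    va_end(ap);
    if (len < 0) {
        va_end(ap2);
        return -1;
    }

    *out = malloc((size_t)len + 1);
    if (*out == NULL) {
        va_end(ap2);
        return -1;
    }

    len = vsnprintf(*out, (size_t)len + 1, fmt, ap2);  /* second pass: format */
    va_end(ap2);
    if (len < 0) {
        free(*out);
        *out = NULL;
        return -1;
    }
    return len;                           /* length without the terminating NUL */
}
```

With such a wrapper, a caller can write `if (NULL == str)` after the call without reading an indeterminate pointer, which is the BSD-like behavior the commit message describes. The caller releases the buffer with `free()`.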