1
1
Граф коммитов

10434 Коммитов

Автор SHA1 Сообщение Дата
Brian Barrett
a6d97e2d6d
Merge pull request #7815 from raafatfeki/topic/ime
Topic/ime: Bring over IME component from master to v4.1
2020-06-26 12:40:35 -07:00
raafatfeki
6e145188d9 fs/ime & fbtl/ime: Support of IME file system
Signed-off-by: raafatfeki <fekiraafat@gmail.com>
2020-06-26 12:26:51 -04:00
William Zhang
db6ed187b2 coll/tuned: Add NULL check to prevent segfault
Signed-off-by: William Zhang <wilzhang@amazon.com>

cr https://code.amazon.com/reviews/CR-23837553

(cherry picked from commit 771f9c011d)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
William Zhang
03758b1ef7 coll/tuned: Fix typos
Signed-off-by: William Zhang <wilzhang@amazon.com>
(cherry picked from commit 50640402ab)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
Mikhail Brinskii
7eb94164a0 COLL/TUNED: Add linear scatter using isend for mlnx platform
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
(cherry picked from commit f2cbd4806e)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
Gilles Gouaillardet
221fad6862 coll/cuda: remove unnecessary references to ORTE
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit 531171ca50)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
Tomislav Janjusic
f51bd8ca0c Coll/hcoll: adding scatterv interface
Signed-off-by: Valentin Petrov valentinp@mellanox.com
(cherry picked from commit 6ea920e225)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
Alex Anenkov
2891a23329 coll/libnbc: add recursive doubling algorithm for MPI_Iallreduce
Signed-off-by: Alex Anenkov <anenkov.ru@gmail.com>
(cherry picked from commit 77d466edf3)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
Mikhail Kurnosov
ba11f31fc8 coll/libnbc: remove debug output
1. Remove debug output in iallgather (I have forgotten to remove it).
2. Remove an incorrect comment in description of ibcast

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
(cherry picked from commit 64abd0f405)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
Mikhail Kurnosov
bf1c8bb394 coll/libnbc/ireduce: silence Coverity warning CID 1440360
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
(cherry picked from commit 8b511c7889)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
Mikhail Kurnosov
5ee1fb62b9 coll/libnbc: add Rabenseifner's algorithm for MPI_Iallreduce
An implementation of R. Rabenseifner's algorithm for MPI_Iallreduce.

This algorithm is a combination of a reduce-scatter implemented with recursive vector halving
and recursive distance doubling, followed either by an allgather.

Limitations:
-- count >= 2^{\floor{\log_2 p}}
-- commutative operations only
-- intra-communicators only

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
(cherry picked from commit 73e048b62a)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
George Bosilca
fd29cce114 Remove few warnings in libnbc identified by clang-1000.11.45.2
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
(cherry picked from commit 66182a294d)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
Mikhail Kurnosov
91a4b4c799 coll/libnbc: add recursive doubling algorithm for MPI_Iallgather
Implements recursive doubling algorithm for MPI_Iallgather.
The algorithm can be used only for power-of-two number of processes.

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
(cherry picked from commit a7386c1e09)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
Mikhail Kurnosov
6971dab943 coll/libnbc: add knomial tree algorithm for MPI_Ibcast
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
(cherry picked from commit b0429d25df)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
Mikhail Kurnosov
a318f117f6 coll/libnbc: add Rabenseifner's algorithm for MPI_Ireduce
An implementation of R. Rabenseifner's algorithm for MPI_Ireduce.
This algorithm is a combination of a reduce-scatter implemented with recursive vector halving
and recursive distance doubling, followed either by a gather.

Limitations:
-- count >= 2^{\floor{\log_2 p}}
-- commutative operations only
-- intra-communicators only

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
(cherry picked from commit 7bd63e79c8)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
Brian Barrett
6f6d8180a3 coll libnbc: Remove dead code
Remove dead code that was causing warnings about unused static
functions.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
(cherry picked from commit 2e24e6ec08)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
Mikhail Kurnosov
de5e435dee coll/libnbc: add recursive doubling algorithm for MPI_Iexscan
Implements recursive doubling algorithm for MPI_Iexscan.
The algorithm preserves order of operations so it can be used both
by commutative and non-commutative operations.

The MCA parameter 'coll_libnbc_iexscan_algorithm' was added for dynamic
algorithm selection.

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
(cherry picked from commit dfe203e167)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
Mikhail Kurnosov
65990af3ad coll/libnbc: add recursive doubling algorithm for MPI_Iscan
Implements recursive doubling algorithm for MPI_Iscan. The algorithm preserves order of operations so it can be used both by commutative and non-commutative operations.

The MCA parameter coll_libnbc_iscan_algorithm was added for dynamic algorithm selection.

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
(cherry picked from commit 3d43ff0f32)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
Jeff Squyres
547fb3d933 libnbc: remove some stale/dead code
Gcc 8 identified hb_tree_csearch() as an infinite recursion, and it
turns out that we never call this function, anyway.  So just remove
it.

Fixes #5670.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 06c1bf73da)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
Aurelien Bouteiller
2692840d40 Always return a valid error code from collective operations
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
(cherry picked from commit 466217fadd)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
Gilles Gouaillardet
d9d84d5dd6 coll/libnbc: fix NBC_Unpack()
always initialize 'size'.

Only the a2a_sched_diss() alltoall algorithm is impacted,
and this algo is currently unused, so there is no need
to backport nor update the NEWS file for now.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit ff48e92864)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
Mikhail Kurnosov
ba221e1a08 coll/base/allgatherv: fix MPI_IN_PLACE processing
The call of MPI_Allgatherv with sendbuf and sendtype parameters equal to MPI_IN_PLACE and NULL correspondingly, produces the segmentation fault.

The problem is that sendtype is used even when sendbuf value is MPI_IN_PLACE. But according to the standard, sendtype and sendcount parameters should be ignored in this case.

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
(cherry picked from commit b45e190e66)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-25 23:06:51 +00:00
Jeff Squyres
465414953d
Merge pull request #7828 from edgargabriel/pr/v4.1.x-avg-fview-size
common/ompio: use avg. file view size in the aggregator selection logic
2020-06-25 11:45:48 -04:00
Jeff Squyres
a3258afad9
Merge pull request #7814 from raafatfeki/topic/gpfs_4.1.x
fs/gpfs: Bring over GPFS component from master to v4.1
2020-06-25 10:47:39 -04:00
Jeff Squyres
2ba26c746d
Merge pull request #7831 from hppritcha/backports/v4.1.x-psm2-updates
Backports/v4.1.x psm2 updates
2020-06-25 10:40:38 -04:00
Brian Barrett
9550cb2a97
Merge pull request #7863 from hppritcha/topic/patch_for_ofi_v41x
add a common ofi whitelist/blacklist
2020-06-24 10:55:03 -07:00
Howard Pritchard
8fcf9cee3b add a common ofi whitelist/blacklist
also add common verbose variable.

Note the verbosity thing is a little tricky owing to the way the MCA frameworks and components are registered and
and initialized.  The BTL's are registered/initialized prior to the MTL components even getting registered.

Here's the change in ofi mtl mca parameters.  Before commit:

            MCA mtl ofi: parameter "mtl_ofi_provider_include" (current value: "psm2", data source: environment, level: 1 user/basic, type: string)
                          Comma-delimited list of OFI providers that are considered for use (e.g., "psm,psm2"; an empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_exclude.
             MCA mtl ofi: parameter "mtl_ofi_provider_exclude" (current value: "shm,sockets,tcp,udp,rstream", data source: default, level: 1 user/basic, type: string)
                          Comma-delimited list of OFI providers that are not considered for use (default: "sockets,mxm"; empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_include.

After commit:

             MCA btl ofi: parameter "btl_ofi_provider_include" (current value: "", data source: default, level: 1 user/basic, type: string, synonym of: opal_common_ofi_provider_include)
                          Comma-delimited list of OFI providers that are considered for use (e.g., "psm,psm2"; an empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_exclude.
             MCA btl ofi: parameter "btl_ofi_provider_exclude" (current value: "shm,sockets,tcp,udp,rstream", data source: default, level: 1 user/basic, type: string, synonym of: opal_common_ofi_provider_exclude)
                          Comma-delimited list of OFI providers that are not considered for use (default: "sockets,mxm"; empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_include.
             MCA mtl ofi: parameter "mtl_ofi_provider_exclude" (current value: "shm,sockets,tcp,udp,rstream", data source: default, level: 1 user/basic, type: string, synonym of: opal_common_ofi_provider_exclude)
                          Comma-delimited list of OFI providers that are not considered for use (default: "sockets,mxm"; empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_include.
             MCA mtl ofi: parameter "mtl_ofi_verbose" (current value: "0", data source: default, level: 3 user/all, type: int, synonym of: opal_common_ofi_verbose)

related to #7755

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
(cherry picked from commit 9f1081a07a)
(cherry picked from commit 45b643d0cf)
2020-06-23 14:02:33 -06:00
Jeff Squyres
6d550a32ca
Merge pull request #7856 from hjelmn/v4.1.x_add_a_missing_osc_rdma_commit_for_dynamic_memory_windows_that_was_forgotten_before_the_last_release_doh
osc/rdma: fix bug in attach for non-debug builds
2020-06-22 17:16:42 -04:00
Nathan Hjelm
059c9614a6 osc/rdma: fix bug in attach for non-debug builds
This commit fixes an issue with non-debug builds where adding an
attachment to the attachment list doesn't actually happen. This
causes all MPI_Win_detach calls to fail. The call was within an
assert which is optimized out in optimized builds.

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
(cherry picked from commit 8ee80d8855)
2020-06-22 08:34:43 -06:00
Joseph Schuchart
1af51d07d6 osc rdma: check for outstanding fragments before completing a request
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
(cherry picked from commit 85ed26f2f8)
2020-06-22 08:32:14 +02:00
Nikola Dancejic
d9bfb1c4da common/ofi: Added multi-NIC support to provider selection
Adds the capability to select a NIC based on hardware locality.
Creates a list of NICs that share the same cpuset as the process,
then selects the NIC based on the (local rank) % (number of NICs).
If no NICs are available that share the same cpuset, the selection process
will create a list of all available NICs and make a selection based on
(local rank) % (number of NICs)

Signed-off-by: Nikola Dancejic <dancejic@amazon.com>
(cherry picked from commit 167d75b42a)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Brian Barrett
62bfd52870 ofi: Call add_procs through PML
Change ompi_mtl_ofi_get_endpoint() to call the active PML's
add_procs() rather than the OFI MTL add_procs() directly when
discovering a new process during operation.

Functionally, this has no impact in correct operation.  However,
the current behavior means that the heterogenous and active PML
checks are not being executed in the dynamic discovery case.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
(cherry picked from commit 64d70b3076)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Robert Wespetal
2a36c02a0c mtl/ofi: Add workaround for EFA local/remote capabilities bug
Some versions of Libfabric contain a bug in EFA where FI_REMOTE_COMM and
FI_LOCAL_COMM are not advertised. In order to workaround this, we need to call
fi_getinfo() without those capability bits to see if EFA is available first.

Also move around some of the provider include/exclude list logic so we can skip
this workaround if applicable.

Signed-off-by: Robert Wespetal <wesper@amazon.com>
(cherry picked from commit 49128a7adb)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Jeff Squyres
ac66f46dc7 mtl/ofi: check for FI_LOCAL_COMM+FI_REMOTE_COMM
Make sure to get an RDM provider that can provide both local and
remote communication.  We need this check because some providers could
be selected via RXD or RXM, but can't provide local communication, for
example.

Add OPAL_CHECK_OFI_VERSION_GE() m4 macro to check that the Libfabric
we're building against is >= a target version.  Use this check in two
places:

1. MTL/OFI: Make sure it is >= v1.5, because the FI_LOCAL_COMM /
   FI_REMOTE_COMM constants were introduced in Libfabric API v1.5.
2. BTL/usnic: It already had similar configury to check for Libfabric
   >= v1.1, but the usnic component was checking for >= v1.3.  So
   update the btl/usnic configury to use the new macro and check for
   >= v1.3.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 21bc9042e1)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Jeff Squyres
3262df243c mtl/ofi: add a .gitignore
Ignore generated files.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit ac54d771ec)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Matias A Cabral
585a40adb7 MTL_OFI: Changed Recv cancel to be non-blocking
Updated the OFI MTL's Recv cancel to be a non-blocking call to match
the MPI spec. Given fi_cancel succeeded, then it is expected that the
user will wait on the request to read the result of if the cancel has
completed.

Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com
(cherry picked from commit 25bdd118ac)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Aravind Gopalakrishnan
13853aa75b mtl/ofi: Fix segfault when not using Thread-Grouping feature
For the non thread-grouping paths, only the first (0th) OFI context
should be used for communication. Otherwise this would access a non existant
array item and cause segfault.

While at it, clarifiy some content regarding SEPs in README (Credit to Matias Cabral
for README edits).

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
(cherry picked from commit 6edcc479c4)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Jeff Squyres
2dd7aa0587 mtl/ofi/Makefile.am: down with tabs!
Replace all tabs with spaces.  No code or logic changes.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit aba2571881)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Gilles Gouaillardet
a7045bceef mtl/ofi: fix configury when VPATH is used
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit 945f830f7a)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Aravind Gopalakrishnan
48df4efb56 mtl/ofi: Fix reference to help text object
When we exceed the threshold number of contexts created, print appropriate help
text

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
(cherry picked from commit 9cabcfdbba)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Brian Barrett
a2cf9a41e3 mtl/ofi: Provide av count hint during initialization
Provide the av_attr.count hint (number of addresses that will be
inserted into the address vector through the life of the process)
at initialization of the address vector.  It's ok to be a bit
wrong, but some endpoints (RxR) can benefit by not going through
the slow growth realloc churn.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
(cherry picked from commit 44be7f139a)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Brian Barrett
29e8544243 mtl/ofi: Print descriptive error message on modex failure
With MTLs, there's no "other transport" when the remote side
does not have an active NIC, so we should print a useful error
message when the modex failed (indicating lack of a NIC on
the remote side).

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
(cherry picked from commit fe25097194)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Aravind Gopalakrishnan
6a27da6d7f mtl/ofi: Add MCA variables to enable SEP and to request number of OFI contexts
Moving to a model where we have users actively _enable_ SEP feature for use
rather than opening SEP by default if provider supports it. This allows us to
not regress (either functionally or for performance reasons) any apps that were
working correctly on regular endpoints.

Also, providing MCA to specify number of OFI contexts to create and default
this value to 1 (Given btl/ofi also creates one by default, this reduces the
incidence of a scenario where we allocate all available contexts by default and
if btl/ofi asks for one more, then provider breaks as it doesn't support it).

While at it, spruce up README on SEP content.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
(cherry picked from commit 37f9aff2a0)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Spruit, Neil R
f770b6cfa1 MTL_OFI: Generation of specialized functions at build time
-> Added new targets in Makefile.am to call a new build script
   generate-opt-funcs.pl to generate specialized functions for
   each *.pm file.

-> Added new perl module *.pm files for send,isend,irecv,iprobe,improbe
   which are loaded by generate-opt-funcs.pl to create new source files
   that correspond to the name of the .pm file to be used as part of
   MTL OFI.

-> Added mtl_ofi_opt.pm.template and updated README with details on the
   specialization features and how to add additional specialization
   support.

-> Added new opt_common/mtl_ofi_opt_common.pm containing common
   functions for generating the specialized functions used by
   all other *.pm modules.

-> Added new mtl_ofi.h which includes the definitions for the
   function symbol table for storing the specialized functions along
   with the definitions for the initialization functions for the
   corresponding function pointers.

-> Based off the OFI provider capabilities the specialized function
   pointers are assigned at mtl_ofi_component_init to the corresponding
   MTL OFI function.

-> mca_mtl_ofi_module_t has been updated with the symbol table
   struct which is assigned at component init.

Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
(cherry picked from commit bef5f50a42)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Aravind Gopalakrishnan
3858b51d11 Fix for SEP when num local procs is greater than available contexts
For cases when the number of local processes is greater than the number of
available contexts, the SEP initialization phase would calculate the number of
contexts to provision for each rank to be 0 and would eventually crash.

Fix the issue here by using regular endpoints in the event the number of local
processes is more than available contexts. This fixes issue #6182.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
(cherry picked from commit e5e19dfcf7)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Brian Barrett
4b293d3823 mtl/ofi: Fix crash if no providers found
Commit 109d0569ff introduced a crash when an error occurred
before ofi_ctxt was allocated, including when no providers
passed the selection logic.  Properly check that the pointer
is not NULL in the error cleanup code before dereferencing
the pointer.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
(cherry picked from commit 6e15128d96)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Aravind Gopalakrishnan
22d0857ee5 MTL/OFI: Add OFI Scalable Endpoint support
OFI MTL supports OFI Scalable Endpoints feature as means to improve
multi-threaded application throughput and message rate. Currently the feature
is designed to utilize multiple TX/RX contexts exposed by the OFI provider in
conjunction with a multi-communicator MPI application model. For more
information, refer to README under mtl/ofi.

Reviewed-by: Matias Cabral <matias.a.cabral@intel.com>
Reviewed-by: Neil Spruit <neil.r.spruit@intel.com>
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
(cherry picked from commit 109d0569ff)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:18 +00:00
Aravind Gopalakrishnan
ee3f6ab841 MTL OFI: Ask for FI_THREAD_DOMAIN support when not using MPI_THREAD_MULTIPLE
When an application is not using multiple threads to call into MPI, we can
safely ask for FI_THREAD_DOMAIN setting from the provider as it should
translate to the least amount of locking in provider.

Conversely, for applications using THREAD_MULTIPLE, explicitly ask for
FI_THREAD_SAFE to prevent race conditions.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
(cherry picked from commit 5cbcae79d8)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:18 +00:00
Matias Cabral
02ac75434a MTL OFI: Add support for mem_tag_format
OFI providers may reserve some of the upper bits of the tag for
internal usage and expose it using mem_tag_format. Check for that
and adjust communicator bits as needed.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
(cherry picked from commit d996f529c0)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:18 +00:00
Michael Heinz
b680893917 Add check for PSM2 reference counting to PSM2 MTL #7721
As discussed, a feature is being added to libpsm2 to correctly handle
the case where the library is opened by multiple OMPI transports in the same
process. (For example, the OFI BTL and the PSM2 MTL).

* Improved error message to indicate required libpsm2 version.

* Adds a test at autogen/configure time for the existence of
  PSM2_LIB_REFCOUNT_CAP.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Signed-off-by: Michael Heinz <michael.william.heinz@intel.com>
(cherry picked from commit f10305a49f)
2020-06-16 10:38:22 -06:00