1
1

18 Коммитов

Автор SHA1 Сообщение Дата
William Zhang
abe3aaa4a7 btl/ofi: Use common provider include/exclude list
The btl/ofi does not currently utilize the common ofi include/exclude
list. Added verification code similar to the mtl/ofi that will check if
the info object is in the include or exclude list. If it isn't in the
include list or is in the exclude list, validate_info will return
OPAL_ERROR. The btl/ofi will no longer pass a provider name as a hint
when calling getinfo, instead filtering the provider during
validate_info.

This patch also moves the is_in_list MTL function into common code and
adds additional debugging output to the BTL to match the MTL standard.

Signed-off-by: William Zhang <wilzhang@amazon.com>
(cherry picked from commit 9b8f463a768206a26e5b1dcea0612a403462b1d0)
2020-08-10 14:28:59 -07:00
tomhers
229724799e BTL/OFI: Fix missing include file.
The missing include file causes an error when using an external version of LibEvent.

Signed-off-by: tomhers <tom.herschberg@gmail.com>
(cherry picked from commit 88f9d2c90f1730f2e0b5bc4951893f60ff5a1332)
2020-07-15 17:14:55 -04:00
Howard Pritchard
8fcf9cee3b add a common ofi whitelist/blacklist
also add common verbose variable.

Note the verbosity thing is a little tricky owing to the way the MCA frameworks and components are registered and
and initialized.  The BTL's are registered/initialized prior to the MTL components even getting registered.

Here's the change in ofi mtl mca parameters.  Before commit:

            MCA mtl ofi: parameter "mtl_ofi_provider_include" (current value: "psm2", data source: environment, level: 1 user/basic, type: string)
                          Comma-delimited list of OFI providers that are considered for use (e.g., "psm,psm2"; an empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_exclude.
             MCA mtl ofi: parameter "mtl_ofi_provider_exclude" (current value: "shm,sockets,tcp,udp,rstream", data source: default, level: 1 user/basic, type: string)
                          Comma-delimited list of OFI providers that are not considered for use (default: "sockets,mxm"; empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_include.

After commit:

             MCA btl ofi: parameter "btl_ofi_provider_include" (current value: "", data source: default, level: 1 user/basic, type: string, synonym of: opal_common_ofi_provider_include)
                          Comma-delimited list of OFI providers that are considered for use (e.g., "psm,psm2"; an empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_exclude.
             MCA btl ofi: parameter "btl_ofi_provider_exclude" (current value: "shm,sockets,tcp,udp,rstream", data source: default, level: 1 user/basic, type: string, synonym of: opal_common_ofi_provider_exclude)
                          Comma-delimited list of OFI providers that are not considered for use (default: "sockets,mxm"; empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_include.
             MCA mtl ofi: parameter "mtl_ofi_provider_exclude" (current value: "shm,sockets,tcp,udp,rstream", data source: default, level: 1 user/basic, type: string, synonym of: opal_common_ofi_provider_exclude)
                          Comma-delimited list of OFI providers that are not considered for use (default: "sockets,mxm"; empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_include.
             MCA mtl ofi: parameter "mtl_ofi_verbose" (current value: "0", data source: default, level: 3 user/all, type: int, synonym of: opal_common_ofi_verbose)

related to #7755

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
(cherry picked from commit 9f1081a07ac3c7b7277a27277ed970ed713207c9)
(cherry picked from commit 45b643d0cfa46f1abb9a5f43cf0ff304cf6a5fea)
2020-06-23 14:02:33 -06:00
Harumi Kuno
61b11e4101 mtl_btl_ofi_rcache_init() before creating domain
mtl_btl_ofi_rcache_init() initializes patcher which should only take
place things are single threaded.  OFI providers may start spawn threads,
so initialize the rcache before creating OFI objects to prevent races.

Authored-by: John L. Byrne <john.l.byrne@hpe.com>
Signed-off-by: Harumi Kuno <harumi.kuno@hpe.com>
(cherry picked from commit f1b21cb77680106be870ad29e5a7534862fceed7)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Nikola Dancejic
d9bfb1c4da common/ofi: Added multi-NIC support to provider selection
Adds the capability to select a NIC based on hardware locality.
Creates a list of NICs that share the same cpuset as the process,
then selects the NIC based on the (local rank) % (number of NICs).
If no NICs are available that share the same cpuset, the selection process
will create a list of all available NICs and make a selection based on
(local rank) % (number of NICs)

Signed-off-by: Nikola Dancejic <dancejic@amazon.com>
(cherry picked from commit 167d75b42ac3ca4770d59c796c011d72e0fffde3)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Harumi Kuno
2e137ae31c set ep to NULL to avoid double close
Per suggestion of @awlauria

Signed-off-by: Harumi Kuno <harumi.kuno@hpe.com>
(cherry picked from commit ab4875ddc2b76ceced120fbfe09d8dcbde6e6ff3)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Harumi Kuno
faaaf5ca1c Add comments about order of close ops
Per suggestion of @awlauria, added some comments about
the need to free ep before resources it points to.

Signed-off-by: Harumi Kuno <harumi.kuno@hpe.com>
(cherry picked from commit 1bc3dab118bb694d932ef365a7a922e514f82a20)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
harumi kuno
bb57080c84 Fix mca_btl_ofi_finalize clean-up logic
This fix is from John L. Byrne (john.l.byrne@hpe.com).

When OFI Libfabric binds objects to endpoints, before the object can
be successfully closed, the endpoint must first be freed.  For scalable
endpoints, objects can also be bound to transmit and receive contexts,
and for objects that are bound to contexts, we need to first free the
contexts before freeing the endpoint. We also need to clear the memory
registration cache.

If we don't clean up properly, then fi\_close may not be able to close
the domain because the dom will have a non-zero ref count.

Signed-off-by: harumi kuno <harumi.kuno@hpe.com>
(cherry picked from commit 3095fabf94de249716fc146a4c4a609bd33a1aff)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:19 +00:00
Thananon Patinyasakdikul
f9439c6d18 btl/ofi: Added 2 side communication support.
The 2 sided communication support is added for non-tagmatching provider
to take advantage of this BTL and PML OB1. The current state is
"functional" and not optimized for performance.

Two sided support is disabled by default and can be turned on by mca
parameter: "mca_btl_ofi_mode".

Signed-off-by: Thananon Patinyasakdikul <thananon.patinyasakdikul@intel.com>
(cherry picked from commit 080115d44069e0c461a1af105cd41f28849cdffc)
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:18 +00:00
Brian Barrett
e975c9975c Revert "Remove the OFI/BTL component"
This reverts commit 192f0f6fff4b4ba0f207054626c935941ff0b5d8.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2020-06-17 20:49:18 +00:00
Ralph Castain
192f0f6fff Remove the OFI/BTL component
Remove this component pending re-architecture of the overall OFI
components. We have had similar issues before when multiple components
use the same library - typical issues are race conditions, initialize
and finalize errors, etc. We are seeing similar problems here as we get
broader exposure to different library version and environment
combinations.

The correct fix in the past has been to centralize the library
interactions in a "common" component. We will pursue that here by moving
some additional functions (e.g., endpoint creation) into the existing
opal/mca/common/ofi component. We can't do that and thoroughly test it
in time for the v4.0.0 release, so we'll simply remove this component
from the release.

Once we have things correctly fixed, we'll submit a PR to restore the
component plus the related fixes to some future v4.x release.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-09-20 18:09:15 -07:00
Thananon Patinyasakdikul
033c364ee0 btl/ofi: Added FI_CONTEXT as requirement.
OFI BTL uses context for completion but never ask for it in
fi_getinfo(3). This commit makes sure that we always ask for FI_CONTEXT
to eliminate any potential error.

Signed-off-by: Thananon Patinyasakdikul <thananon.patinyasakdikul@intel.com>
2018-07-17 12:18:43 -07:00
Thananon Patinyasakdikul
be76896f7c btl/ofi: progress now happens after a threshold.
This commit changed the way btl/ofi call progress. Before, we force
progression with every rdma/atomic call. This gives performance boost in
some case and slow down on others. Now we only force progression after
some number of rdma calls which result in better performance overall.

Also added new MCA parameter 'mca_btl_ofi_progress_threshold' to set
the threshold number. The new default is 64.

Also:
Added FI_DELIVERY_COMPLETE to tx_rtx flags to ensure that the completion
is generated after the message has been received on the remote side.

Signed-off-by: Thananon Patinyasakdikul <thananon.patinyasakdikul@intel.com>
2018-06-26 10:39:45 -07:00
Thananon Patinyasakdikul
dae3c9447c btl/ofi: add scalable endpoint support.
This commit add support for scalable endpoint to enhance multithreaded
application performance. The BTL will detect the support from ofi
provider and will fallback to normal usage of scalable endpoint is not
supported.

NEW MCA parameters:
- mca_btl_ofi_disable_sep: force the btl to not use scalable endpoint.
- mca_btl_ofi_num_contexts_per_module: number of communication context
  to create (should be the same as number of thread).

Signed-off-by: Thananon Patinyasakdikul <thananon.patinyasakdikul@intel.com>
2018-06-14 15:44:29 -07:00
Thananon Patinyasakdikul
b6e07bfac6 btl/ofi: change required mr mode bits.
FI_MR_UNSPEC is not supposed to be used beyond ofi version 1.5. This
commit replaces FI_MR_UNSPEC with the new FI_MR_BASIC mode bits
(FI_MR_PROV_KEY | FI_MR_ALLOCATED | FI_MR_VIRT_ADDR).

The btl functionality remains the same.

Signed-off-by: Thananon Patinyasakdikul <thananon.patinyasakdikul@intel.com>
2018-06-11 09:25:57 -07:00
Thananon Patinyasakdikul
f34a73af70 btl/ofi: handles FI_EINTR properly.
OFI sockets provider will sometime return FI_EINTR. Apparently this flag
does not exist in OFI version 1.5. This commit added an ifdef check.

Signed-off-by: Thananon Patinyasakdikul <thananon.patinyasakdikul@intel.com>
2018-06-05 18:40:00 -07:00
Thananon Patinyasakdikul
c18cf7315f btl/ofi: configury: will not build if OFI ver < 1.5.
This commit fixes problem where users have libfabric version < 1.5
installed and build failed because the new MR modebits does not exist.

Signed-off-by: Thananon Patinyasakdikul <thananon.patinyasakdikul@intel.com>
2018-06-04 10:35:50 -07:00
Thananon Patinyasakdikul
6efb8069ec new btl/ofi: RDMA only btl using libfabric.
This commit added new transport layer to be used with osc rdma module.
This BTL provides put, get, atomic and fetch atomic operations. It can
be used with multiple hardware vendors as long as they have their
provider under Libfabric and have the right capabilities.

Signed-off-by: Thananon Patinyasakdikul <thananon.patinyasakdikul@intel.com>
2018-06-01 15:22:04 -07:00