btl/ofi: Disable EFA provider in versions earlier than libfabric 1.12.0
EFA incorrectly implements FI_DELIVERY_COMPLETE in earlier libfabric versions. While FI_DELIVERY_COMPLETE would be advertised by the provider, completions would return too early by not accounting for bounce buffers on the receive side. This would cause the BTL to receive early completions that lead to correctness issues.

This is not an issue in the mtl/ofi as it does not require FI_DELIVERY_COMPLETE.

Signed-off-by: William Zhang <wilzhang@amazon.com>
Parent: 41df122083
Commit: a7dcfd9874
@@ -59,6 +59,17 @@ static int validate_info(struct fi_info *info, uint64_t required_caps)
 
     BTL_VERBOSE(("validating device: %s", info->domain_attr->name));
 
+    /* EFA does not fulfill FI_DELIVERY_COMPLETE requirements in prior libfabric
+     * versions. The prov version is set as:
+     * FI_VERSION(FI_MAJOR_VERSION * 100 + FI_MINOR_VERSION, FI_REVISION_VERSION * 10)
+     * Thus, FI_VERSION(112,0) corresponds to libfabric 1.12.0
+     */
+    if (!strncasecmp(info->fabric_attr->prov_name, "efa", 3)
+        && FI_VERSION_LT(info->fabric_attr->prov_version, FI_VERSION(112,0))) {
+        BTL_VERBOSE(("unsupported libfabric efa version"));
+        return OPAL_ERROR;
+    }
+
     /* we need exactly all the required bits */
     if ((info->caps & required_caps) != required_caps) {
         BTL_VERBOSE(("unsupported caps"));
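The version encoding described in the comment above can be checked in isolation. The following is a minimal sketch, not part of the commit: the `SK_*` macros are local stand-ins assumed to mirror libfabric's `FI_VERSION` and `FI_VERSION_LT` from `<rdma/fabric.h>`, and `efa_prov_version()` and `efa_too_old()` are hypothetical helper names introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <strings.h> /* strncasecmp (POSIX) */

/* Local stand-ins (assumption): mirror libfabric's FI_VERSION helpers so the
 * sketch builds without libfabric installed. Real code should include
 * <rdma/fabric.h> and use FI_VERSION / FI_VERSION_LT directly. */
#define SK_VERSION(major, minor) (((uint32_t)(major) << 16) | (uint32_t)(minor))
#define SK_MAJOR(v) ((v) >> 16)
#define SK_MINOR(v) ((v) & 0xFFFFu)
#define SK_VERSION_LT(a, b)                 \
    ((SK_MAJOR(a) < SK_MAJOR(b)) ||         \
     (SK_MAJOR(a) == SK_MAJOR(b) && SK_MINOR(a) < SK_MINOR(b)))

/* Hypothetical helper: encode a libfabric release x.y.z the way the patch
 * comment says the EFA provider sets prov_version:
 * major = x * 100 + y, minor = z * 10. */
static uint32_t efa_prov_version(unsigned major, unsigned minor, unsigned revision)
{
    return SK_VERSION(major * 100 + minor, revision * 10);
}

/* Hypothetical helper: the same predicate as the check added in
 * validate_info(). True when the provider name starts with "efa"
 * (case-insensitive) and its version is older than libfabric 1.12.0. */
static bool efa_too_old(const char *prov_name, uint32_t prov_version)
{
    return strncasecmp(prov_name, "efa", 3) == 0
           && SK_VERSION_LT(prov_version, SK_VERSION(112, 0));
}
```

Under these assumptions, libfabric 1.11.2 encodes as `SK_VERSION(111, 20)` and is rejected, while 1.12.0 encodes as `SK_VERSION(112, 0)` and passes. Because the name comparison only looks at the first three characters, any provider name beginning with "efa" is subject to the check.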