- in sine cases persistent request was deleted during completion
callback, this cause double free of linked UCX request (assert
in debug build or hang in release build)
- UCX request is freed prior completion callback
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 6fe0a73861)
Flag that we provided a notification and ignore it if it attempts to come back up.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit ea0d70bc9396def61545e2ce492a55c4c3aa7772)
Per https://github.com/open-mpi/ompi/issues/5031, if the user didn't specify a particular PMIx installation, then default back to the internal version if it is newer than the discovered external one. PMIx doesn't yet provide a full signature so we have to just get as close as possible for now.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 1e6aaf7f22)
- there worked progress was missed on startup which caused hang
on one of ranks
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit a081fba046)
Due to decreasing support by vendors/other orgs for the OpenIB BTL,
only look for iWarp/RoCE devices by default. Allow IB HCAs
with ports configured for ethernet.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
OFI BTL uses context for completion but never ask for it in
fi_getinfo(3). This commit makes sure that we always ask for FI_CONTEXT
to eliminate any potential error.
Signed-off-by: Thananon Patinyasakdikul <thananon.patinyasakdikul@intel.com>
OMPIO now uses the correct delete function depending on the fs
mca_common_ompio_file_delete now works this way instead
of calling POSIX unlink:
- create a minimal file handle with the given file name
- select the best fs component using this file handle
- call the component-specific file delete function
Signed-off-by: Gaëtan Bossu <gbossu@ddn.com>
It is needed because the fs components might be queried due to a MPI_File_delete call.
And in this case, we don't have a communicator value.
Signed-off-by: Gaëtan Bossu <gbossu@ddn.com>
instead of invoking ompi_request_test_all(), that will end up
calling opal_progress() recursively, manually check the status
of the requests.
the same method is used in ompi_comm_request_progress()
Refs open-mpi/ompi#3901
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Blocking `MPI_SCATTER` and `MPI_SCATTERV` were fixed in 506d0e96f4
but noblocking `MPI_ISCATTER` and `MPI_ISCATTERV` were not fixed yet.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
Move memory hooks init (for request based operation) in osc ucx to window
creation time, to avoid performance issue in MPI initialization.
Signed-off-by: Xin Zhao <xinz@mellanox.com>
This commit fixes two bugs in the RMA/atomic emulation code:
1) Fix a fragment leak when using AMO emulation.
2) Always initialize the single-copy emulation code. This is required
to use the AMO support.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>