When romio314 was first pulled in an extra patch was applied to it, see commit
92f6c7c1e210c559471a05aaac9b19e0bd3d71bb. Most of that patch is already present
in vanilla romio321, but the fix for MPIO_DATATYPE_ISCOMMITTED() isn't.
If that macro doesn't set err_ then some paths end up with a variable being used
uninitialized. In particular you can trace through romio321/romio/mpi-io/read.c
to see what happens with error_code. It's an uninitialized stack variable that goes
through three MPIO_CHECK_* macros none of which set it. The macros consistently set
error_code to a failure if they see something wrong, but they don't consistently
set it to success when things are fine.
And then in the last macro MPIO_CHECK_DATATYPE it tries to look at the value
of error_code that was never set.
Signed-off-by: Mark Allen <markalle@us.ibm.com>
(cherry picked from commit f413ef6b142fc3498bdb575d270f9afd0a840863)
The call of MPI_Allgather with sendbuf and sendtype parameters equal to MPI_IN_PLACE and NULL correspondingly, produces the segmentation fault.
The problem is that sendtype is used even when sendbuf value is MPI_IN_PLACE. But according to the standard, sendtype and sendcount parameters should be ignored in this case.
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
(cherry picked from commit 540c2d1)
- there is C11 naming conflict - atomic_init is C macro
which cause building issue
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 3295b238008a6b2e27941c60d757e64ad796fea2)
- in sine cases persistent request was deleted during completion
callback, this cause double free of linked UCX request (assert
in debug build or hang in release build)
- UCX request is freed prior completion callback
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 6fe0a73861b53efac12e740c831802057391525d)
Flag that we provided a notification and ignore it if it attempts to come back up.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit ea0d70bc9396def61545e2ce492a55c4c3aa7772)
Per https://github.com/open-mpi/ompi/issues/5031, if the user didn't specify a particular PMIx installation, then default back to the internal version if it is newer than the discovered external one. PMIx doesn't yet provide a full signature so we have to just get as close as possible for now.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 1e6aaf7f226f5a4d940e544079e3977229746c11)
- there worked progress was missed on startup which caused hang
on one of ranks
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit a081fba0465e0e03472fc45c9a4a7154f539e3f6)
Due to decreasing support by vendors/other orgs for the OpenIB BTL,
only look for iWarp/RoCE devices by default. Allow IB HCAs
with ports configured for ethernet.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
OFI BTL uses context for completion but never ask for it in
fi_getinfo(3). This commit makes sure that we always ask for FI_CONTEXT
to eliminate any potential error.
Signed-off-by: Thananon Patinyasakdikul <thananon.patinyasakdikul@intel.com>
OMPIO now uses the correct delete function depending on the fs
mca_common_ompio_file_delete now works this way instead
of calling POSIX unlink:
- create a minimal file handle with the given file name
- select the best fs component using this file handle
- call the component-specific file delete function
Signed-off-by: Gaëtan Bossu <gbossu@ddn.com>
It is needed because the fs components might be queried due to a MPI_File_delete call.
And in this case, we don't have a communicator value.
Signed-off-by: Gaëtan Bossu <gbossu@ddn.com>
instead of invoking ompi_request_test_all(), that will end up
calling opal_progress() recursively, manually check the status
of the requests.
the same method is used in ompi_comm_request_progress()
Refs open-mpi/ompi#3901
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Blocking `MPI_SCATTER` and `MPI_SCATTERV` were fixed in 506d0e96f4
but noblocking `MPI_ISCATTER` and `MPI_ISCATTERV` were not fixed yet.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>