I found that I needed to apply the same change as #5597 to orterun.c for the environment variables to work correctly.
Signed-off-by: Simon Byrne <simonbyrne@gmail.com>
(cherry picked from commit 9c8671c48b946f4387cddb6a66aaab572fa983dd)
Update PMIx to latest master to get supporting updates. For
connect/accept (part of comm_spawn as well), lookup locality for all
participating procs on the node and compute the relative locality so it
can be used for MPI operations.
Signed-off-by: Ralph Castain <rhc@pmix.org>
(cherry picked from commit d202e10c1407d2f9177e9b871eadde1f25526676)
To avoid fully initializing the osc/ucx component for MPI application
that are not using One-Sided functionality, the initialization happens
at the first MPI window creation.
This commit ensures atomicity of global state modifications.
ported from: 6678ac0f557935b291ec2310216b7ea46e0c13b1
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
fix alignment, and fix error path
a non blocking collective might return ompi_request_null, so we should not
retain anything in that case.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit open-mpi/ompi@63d3ccde9d)
Since ompi_coll_base_nbc_request_t is to be used in an
opal_free_list_t, it must be returned into a "clean" state.
So cleanup some data in the callback completion subroutines.
This fixes a regression introduced in open-mpi/ompi@0fe756d416
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit open-mpi/ompi@0862c409f1)
base ompi_coll_libnbc_request_t on top of ompi_coll_base_nbc_request_t
to correctly support the retention of datatypes/operators
This fixes a regression introduced in open-mpi/ompi@0fe756d416
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit open-mpi/ompi@f8eef0fde9)
Provide a missing header and paren
Thanks to @zerothi for the assistance
Signed-off-by: Ralph Castain <rhc@pmix.org>
(cherry picked from commit bd5a1765eea200651babc5bfd9f45a9f3cedefbc)
Override the defaults when provided. Ignore LSF binding file if user
overrides by specifying a policy.
Fixes#6631
Signed-off-by: Ralph Castain <rhc@pmix.org>
(cherry picked from commit ea0dfc321809db50f78e742da1d22f9ef59650a3)
Use linear with sync alltoall algorithm for certain message/comm size
ranges. Does not affect default fixed decision, unless HPCX (with its
custom parameters) is used or corresponding mca is set.
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
(cherry picked from commit 404c4800688548b021bda68bdf10792424e6b1c5)
These issues were introduced in the recent commit b71af0eca0.
This commit fixes Coverity CID 1451661 and 1451660.
Though `c_info` part was an actual bug, the `c_sendtypes` part was not.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
(cherry picked from commit open-mpi/ompi@facf8c5e98)
- ignore sendcounts, sendispls and sendtypes arguments when MPI_IN_PLACE is used
- use the right size when an inter-communicator is used.
Thanks Markus Geimer for reporting this.
Refs. open-mpi/ompi#5459
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit open-mpi/ompi@cdaed89d04)
Commit d7053a3 broke things for the case when Open MPI 4.0.x is built
without UCX support. Problem was it was trying to partially initialize
the btl to try and delay printing of a help message till wireup. Well
this sort of doesn't work in all cases. Rather than keep piling on
changes to support a help message for a BTL that we are deprecating, take
a keep it simple stupid approach.
So, revert most of d7053a3 and instead put the help message back in the
original location, during scan of ports of the available HCAs to check
for whether or not link layer for that port is configured for ethernet or infiniband.
If Open MPI was built with UCX support, don't emit the help message, if
UCX was not linked in, emit the help message.
Verified on a system with connectX5 HCAs configured with two ports configured
for ethernet and two for infiniband.
relates to #6785
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
MPI standard states a user MPI_Op and/or user MPI_Datatype can be free'd
after a call to a non blocking collective and before the non-blocking
collective completes.
Retain user (only) MPI_Op and MPI_Datatype when the non blocking call is
invoked, and set a request callback so they are free'd when the MPI_Request
completes.
Thanks Thomas Ponweiser for reporting this
Fixesopen-mpi/ompi#2151Fixesopen-mpi/ompi#1304
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit open-mpi/ompi@0fe756d416)
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
Correctly bubble up errors in NBC collective operations
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
The error field of requests needs to be rearmed at start, not at create
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
(cherry picked from commit open-mpi/ompi@65660e5999)
In common_ompi_aggregators calc_cost routine:
do not cast the real division to an int intermediately.
This patch removes the obsolete int variable c and assigns
the result of the P_a/P_x division directly to n_as.
With the intermediate int c variable, n_as gets 0 if P_a < P_x,
resulting in a division by 0 when computing n_s.
Signed-off-by: Harald Klimach <harald.klimach@uni-siegen.de>
(cherry picked from commit e222a04ae57e5d09b8559f3c111de1f10a47246a)