Keep track of the size of the blocklen_per_process and
displs_per_process arrays in the aggregator data structure to minimize
the number of realloc function calls required in the shuffle_init
operation.
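As an illustration only (this is not the ompio code), the idea of
remembering an array's allocated size so that most appends need no
realloc call can be sketched as follows. The type tracked_array_t, its
fields, and both functions are hypothetical names invented for this
sketch, and the doubling growth policy is an assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-process array that remembers its allocated capacity. */
typedef struct {
    int    *data;      /* e.g. block lengths for one process */
    size_t  count;     /* entries currently in use */
    size_t  capacity;  /* entries allocated: the "size" being tracked */
    size_t  reallocs;  /* instrumentation only: realloc calls made */
} tracked_array_t;

/* Ensure room for at least `needed` entries.
 * Returns 0 on success, -1 if realloc fails.
 * Overflow of the doubled capacity is not handled in this sketch. */
static int tracked_array_reserve(tracked_array_t *a, size_t needed)
{
    assert(a != NULL);
    if (needed <= a->capacity) {
        return 0;                   /* big enough already: no realloc */
    }
    size_t new_cap = a->capacity ? a->capacity : 4;
    while (new_cap < needed) {
        new_cap *= 2;               /* assumed policy: grow geometrically */
    }
    int *tmp = realloc(a->data, new_cap * sizeof(int));
    if (tmp == NULL) {
        return -1;                  /* a->data is still valid on failure */
    }
    a->data = tmp;
    a->capacity = new_cap;
    a->reallocs++;
    return 0;
}

/* Append one value, growing the array only when the capacity is used up. */
static int tracked_array_push(tracked_array_t *a, int value)
{
    if (tracked_array_reserve(a, a->count + 1) != 0) {
        return -1;
    }
    a->data[a->count++] = value;
    return 0;
}
```

Starting from a zero-initialized tracked_array_t, appending 100 values
this way calls realloc 6 times instead of 100.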
Signed-off-by: raafatfeki <fekiraafat@gmail.com>
This commit attempts to update the romio io component to not use
functions removed in MPI-3.0 (2012). This is a first cut and will
probably need to be reviewed for correctness.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
romio assumes that all predefined datatypes are contiguous. Because of
the (terribly named) composed datatypes MPI_SHORT_INT, MPI_DOUBLE_INT,
MPI_LONG_INT, etc., this is an incorrect assumption. The simplest way to
fix this is to override the MPI_Type_get_envelope and
MPI_Type_get_contents calls with calls that will work on these
datatypes. Note that not all calls to these MPI functions are
replaced, only the ones used when flattening a non-contiguous
datatype.
References #5009
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Fixes issue #5069, which relates to a BigMPI bug with the use of
MPI_Type_vector to construct very large datatypes (>2GB).
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
An incorrectly named variable caused all pml variables to disappear
from ompi_info. This commit fixes the typo. We may add some logic into
the MCA base to catch these sorts of things in the future.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit fixes a segfault in mtl-portals4 finalize(). The segfault
occurs if finalize() is called without any calls to add_procs(). This
commit resolves the segfault by skipping the progress() loop in
finalize() if Portals was not initialized.
Signed-off-by: Todd Kordenbrock (thkgcode@gmail.com)
This commit fixes a segfault in btl-portals4 add_procs(). The segfault
occurs if add_procs() is called after a del_procs() call that reduces
the proc count to zero which would cause PT and NI resources to be
freed. This commit resolves the segfault by using a common
initialization boolean and only freeing module resources in
finalize().
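The pattern described above can be sketched in isolation as follows.
This is not the btl-portals4 code: demo_module_t and the demo_*
functions are hypothetical, and the counters stand in for the real
progress loop and PT/NI teardown. The same initialization check is what
lets finalize() return early when add_procs() was never called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical module state; counters replace real network resources. */
typedef struct {
    bool   initialized;      /* single flag shared by all entry points */
    size_t num_procs;
    int    progress_calls;   /* instrumentation only */
    int    resources_freed;  /* instrumentation only */
} demo_module_t;

/* Allocate resources once, on first use. */
static void demo_init_resources(demo_module_t *m)
{
    if (!m->initialized) {
        m->initialized = true;
    }
}

static void demo_add_procs(demo_module_t *m, size_t n)
{
    demo_init_resources(m);
    m->num_procs += n;
}

static void demo_del_procs(demo_module_t *m, size_t n)
{
    assert(n <= m->num_procs);
    m->num_procs -= n;
    /* Deliberately do NOT free resources here, even when the count
     * reaches zero: a later demo_add_procs() may still need them. */
}

static void demo_finalize(demo_module_t *m)
{
    if (!m->initialized) {
        return;              /* nothing was set up: skip progress and free */
    }
    m->progress_calls++;     /* stands in for draining the progress loop */
    m->resources_freed++;    /* the only place resources are released */
    m->initialized = false;
}
```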
Signed-off-by: Todd Kordenbrock (thkgcode@gmail.com)
Have Open MPI's PMIx component set PMIx's "show_load_errors" to do
the same thing that Open MPI's "show_load_errors" does.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Some of the show_help() messages that were added in 40afd525f8 were
really normal / expected behavior (e.g., if 2 peers connect in the TCP
BTL more-or-less simultaneously, one of them will drop the connection
-- no need to show_help() about this; it's expected behavior). Roll
back these messages to be opal_output_verbose() kinds of messages.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
This is a separate commit from the commit where the block was removed
from configure.ac because this NEWS bullet will almost certainly not
cherry-pick cleanly to release branches.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
We thought there was a silent data corruption issue on POWER 7/BE
systems, so we blocked building on POWER 7/BE systems altogether. We
later figured out that it was just data hangs -- not silent data
corruption. So in hindsight, the configure block probably wasn't
necessary -- but we didn't know it at the time.
Regardless, the hangs have now been fixed, and we're removing the
POWER 7/BE block in configure.
For more detail on the entire saga, see
https://github.com/open-mpi/ompi/issues/4349#issuecomment-374970982.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Per 0ab6b201fed, note in the MPI_Comm_spawn_multiple.3in man page that
the array_of_commands does not need to be terminated -- it just needs
to have exactly "count" entries. In the Fortran binding, at least,
this is different than in prior released versions of Open MPI (it's
not a backwards incompatibility, since prior versions of Open MPI
required array_of_commands to be blank-string-terminated in Fortran --
this change makes Open MPI be *less* restrictive, and therefore still
backwards compatible).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
MPI defines the "argv" param to Fortran MPI_COMM_SPAWN as being
terminated by a blank string. While not precisely defined (except
through a non-binding example, Example 10.2, MPI-3.1 p382:6-29), one
can infer that the "array_of_argv" param to Fortran
MPI_COMM_SPAWN_MULTIPLE is also a set of argv, each of which is
terminated by a blank line.
The "array_of_commands" argument to Fortran MPI_COMM_SPAWN_MULTIPLE is
a little less well-defined. It is *assumed* to be of length "count"
(another parameter to MPI_COMM_SPAWN_MULTIPLE) -- and *not* be
terminated by a blank string. This is also given credence by the same
example 10.2 in MPI-3.1.
The previous code assumed that "array_of_commands" should also be
terminated by a blank line -- but per the above, this is incorrect.
Instead, we should just parse our "count" number of strings from
"array_of_commands" and *not* look for a blank line termination.
This commit separates these two cases:
* ompi_fortran_argv_blank_f2c(): parse a Fortran array of strings out
and stop when reaching a blank string.
* ompi_fortran_argv_count_f2c(): parse a Fortran array of strings out
and stop when "count" number of strings have been parsed.
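The difference between the two cases can be sketched as below. This is
a simplified illustration, not the Open MPI implementation: the demo_*
names and signatures are hypothetical, and the sketch assumes the
Fortran strings are stored back to back, each blank-padded to the same
fixed width "len".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if a fixed-width Fortran string contains only blanks. */
static int demo_is_blank(const char *fstr, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (fstr[i] != ' ') {
            return 0;
        }
    }
    return 1;
}

/* Copy one blank-padded string into a new C string, trimming the padding. */
static char *demo_f2c_string(const char *fstr, size_t len)
{
    while (len > 0 && fstr[len - 1] == ' ') {
        len--;
    }
    char *c = malloc(len + 1);
    if (c == NULL) {
        return NULL;
    }
    memcpy(c, fstr, len);
    c[len] = '\0';
    return c;
}

/* Free a NULL-terminated argv produced by the functions below. */
static void demo_argv_free(char **argv)
{
    if (argv == NULL) {
        return;
    }
    for (size_t i = 0; argv[i] != NULL; i++) {
        free(argv[i]);
    }
    free(argv);
}

/* Shared helper: convert exactly n entries into a NULL-terminated argv. */
static char **demo_convert(const char *array, size_t len, size_t n)
{
    assert(array != NULL && len > 0);
    char **argv = calloc(n + 1, sizeof(char *));
    if (argv == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        argv[i] = demo_f2c_string(array + i * len, len);
        if (argv[i] == NULL) {
            demo_argv_free(argv);
            return NULL;
        }
    }
    return argv;    /* argv[n] is already NULL thanks to calloc */
}

/* Case 1: stop at the first all-blank entry (caller guarantees one exists). */
static char **demo_argv_blank_f2c(const char *array, size_t len)
{
    size_t n = 0;
    while (!demo_is_blank(array + n * len, len)) {
        n++;
    }
    return demo_convert(array, len, n);
}

/* Case 2: take exactly `count` entries and never look for a terminator. */
static char **demo_argv_count_f2c(const char *array, size_t len, size_t count)
{
    return demo_convert(array, len, count);
}
```

With the count-based variant, an "array_of_commands" holding exactly
"count" entries is parsed without reading past its end, which the
blank-terminated variant would do if no blank entry followed.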
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
javah is no longer available as of Java 10, so try
javac -h first (available since Java 8) and fall back on javah
Refs. open-mpi/ompi#5000
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
The strtoul function sets its end pointer to the first non-digit
character, which is a '.' in this case. Calling strtoul at that point
will always yield a zero -- you have to move past it to get the
remaining number.
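A minimal sketch of the behavior, assuming the input is a dotted pair
of numbers such as "3.1" (the function name demo_parse_pair and the
input format are illustrative assumptions, not the code touched by
this commit):

```c
#include <assert.h>
#include <stdlib.h>

/* Parse "<first>.<second>" into two numbers.
 * Returns 0 on success, -1 if the string does not have that shape. */
static int demo_parse_pair(const char *s, unsigned long *first,
                           unsigned long *second)
{
    char *end = NULL;

    assert(s != NULL && first != NULL && second != NULL);

    *first = strtoul(s, &end, 10);
    if (end == s || *end != '.') {
        return -1;          /* no digits, or no '.' separator */
    }

    /* `end` now points AT the '.'. Passing `end` itself to strtoul
     * would convert nothing and return 0, so step past the separator. */
    const char *rest = end + 1;
    *second = strtoul(rest, &end, 10);
    if (end == rest) {
        return -1;          /* nothing numeric after the '.' */
    }
    return 0;
}
```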
Thanks to Greg Lee for the detailed analysis of the problem.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>