- Add some missing AC_CHECK_SIZEOF's in configure.ac
- Remove some unused variables
- Initialize some variables
- Fix some parameter types
- Cast where appropriate/safe to fix warnings
- Move ompi/mca/common/monitoring Fortran bindings to a separate .c
file so that they can use different #define's than the C bindings,
and therefore compile properly / without warnings.
- Fix signedness discrepancies
- Who knew? Separated these into multiple #if's, instead:
```
// This is undefined behavior
#define HAVE_FOO defined(FOO)
#define YOW (HAVE_FOO && defined(BAR))
```
- Fix some typos in OMPI_BUILD_HOST logic
- Don't "2>/dev/null" in OMPI_BUILD_HOST logic; it just hides errors
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
MPI_Ialltoallw() and friends take a const MPI_Datatype types[] argument.
In order to be able to call OBJ_RELEASE(types[0]), we used to simply
drop the const modifier. This change make it right by introducing the
OBJ_RELEASE_NO_NULLIFY(object) macro that no more set object = NULL
if the object is freed.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Ensure we correctly collect and save the cpuset of the process
separately from its locality string. Ensure we use the correct one when
computing things like relative locality between processes.
Signed-off-by: Ralph Castain <rhc@pmix.org>
A mindless task for a lazy weekend: convert all the README and
README.txt files to Markdown. Paired with the slow conversion of all
of our man pages to Markdown, this gives a uniform language to the
Open MPI docs.
This commit moved a bunch of copyright headers out of the top-level
README.txt file, so I updated the relevant copyright header years in
the top-level LICENSE file to match what was removed from README.txt.
Additionally, this commit did (very) little to update the actual
content of the README files. A very small number of updates were made
for topics that I found blatently obvious while Markdown-izing the
content, but in general, I did not update content during this commit.
For example, there's still quite a bit of text about ORTE that was not
meaningfully updated.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Co-authored-by: Josh Hursey <jhursey@us.ibm.com>
8017f12 introduced a new function to get the package rank of a process,
which had a pass-by-value signature (opal_process_info_t); and coverity
was not happy about it. This commit changes the signature to take a
reference to opal_process_info_t instead.
Signed-off-by: Raghu Raja <craghun@amazon.com>
If PMIX_PACKAGE_RANK is available, uses this value to select between multiple
NIC of equal distance between the current process. If this value is not
available, try to calculate it by getting the locality string from each local
process and assign a package_rank. If everything fails, fall back to using
process_id.rank to select the NIC. This last case is not ideal, but has a small
chance of occuring, and causes an output to be displayed to notify that this is
occuring.
Signed-off-by: Nikola Dancejic <dancejic@amazon.com>
Coverity complained about uninitialized variables; ensure that they
are initialized to 0 in all cases.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
There was a bug allowing for partial packing of non-data elements (such as loop
and end_loop markers) during the exit condition of a pack/unpack call. This has
basically no meaning. Prevent this bug from happening by making sure the element
point to a data before trying to partially pack it.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
In the below type creation sequence
MPI_Type_create_resized(MPI_INT, 0, 6, &mydt1);
MPI_Type_contiguous(1, mydt1, &mydt2);
I think both mydt1 and mydt2 should have extent 6.
The Type_create_resized would add an UB marker into the type map,
and the definition of Type_contiguous would maintain the same
markers in the new map.
The only counter argument I can think of to the above is if
we declared that mydt1 is illegal because it's putting data
on addresses that don't satisfy the alignment requirement.
But in my interpretation of the standard the term "alignment
requirement" is a property of the system memory, and MPI defines
"extent" in a way to make it easy to create MPI datatypes that
support the system's alignment requirements. But the standard
isn't saying it's illegal to make MPI datatypes that don't satisfy
the system's alignment requirements. I think this is true also
because the MPI datatypes might be used in file IO where the
requirements are different, so that's my long winded explanation
for why I don't think we can declare mydt1 illegal.
Complete example:
#include <stdio.h>
#include <mpi.h>
int main() {
MPI_Datatype mydt1, mydt2;
MPI_Aint lb, ext;
MPI_Init(0, 0);
MPI_Type_create_resized(MPI_INT, 0, 6, &mydt1);
MPI_Type_commit(&mydt1);
MPI_Type_contiguous(1, mydt1, &mydt2);
MPI_Type_commit(&mydt2);
MPI_Type_get_extent(mydt1, &lb, &ext);
printf("mydt1 extent %d\n", (int)ext);
MPI_Type_get_extent(mydt2, &lb, &ext);
printf("mydt2 extent %d\n", (int)ext);
MPI_Type_free(&mydt1);
MPI_Type_free(&mydt2);
MPI_Finalize();
return(0);
}
% mpicc -o x test.c
% mpirun -np 1 ./x
Without this PR the output is
> mydt1 extent 6
> mydt2 extent 8
With this PR both extents are 6.
Fwiw I also tested with mpich and they give 6 for both extents.
Signed-off-by: Mark Allen <markalle@us.ibm.com>
With Open MPI 5.0, the decision was made to stop building
3rd-party packages, such as Libevent, HWLOC, PMIx, and PRRTE as
MCA components and instead 1) start relying on external libraries
whenever possible and 2) Open MPI builds the 3rd party
libraries (if needed) as independent libraries, rather than
linked into libopen-pal.
This patch moves the PMIx library bundled with Open MPI from a
MCA framework to a stand-alone library built outside of OPAL. Due
to the amount of code in the MCA base (and its assumptions about
being part of an MCA framework), the framework is left with no
active components. Any pre-installed version of PMIx 3.0.0 or
newer is preferred over the internal version.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
With Open MPI 5.0, the decision was made to stop building
3rd-party packages, such as Libevent, HWLOC, PMIx, and PRRTE as
MCA components and instead 1) start relying on external libraries
whenever possible and 2) Open MPI builds the 3rd party
libraries (if needed) as independent libraries, rather than
linked into libopen-pal.
This patch moves the hwloc library bundled with Open MPI from a
MCA framework to a stand-alone library built outside of OPAL. Due
to the amount of code in the MCA base (and its assumptions about
being part of an MCA framework), the framework is left with no
active components. Any pre-installed version of HWLOC 1.6 or
newer is preferred over the internal version.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
With Open MPI 5.0, the decision was made to stop building
3rd-party packages, such as Libevent, HWLOC, PMIx, and PRRTE as
MCA components and instead 1) start relying on external libraries
whenever possible and 2) Open MPI builds the 3rd party
libraries (if needed) as independent libraries, rather than
linked into libopen-pal.
This patch moves libevent from an MCA framework to a stand-alone
library built outside of OPAL. A wrapper in opal/util is provided
to minimize the unnecessary changes in the rest of the code. When
using the internal Libevent, it will be installed as a stand-alone
libevent.a, instead of bundled in OPAL. Any pre-installed version
of Libevent at or after 2.0.21 is preferred over the internal
version.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
The Linux component was an attempt to hook calls by patching the dynamic
symbol table. It, unfortunately, does not work as it will always miss
calls made internally by glibc. For example, it might catch a user call
directly to munmap but will miss the chain free -> munmap. Since the
later is the common case we were trying to hook this made the component
unusable. This PR finally kills the component.
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
This PR removes the MCA_BTL_DES_FLAGS_PUT and MCA_BTL_DES_FLAGS_GET
descriptor flags. At some point these had some meaning but they were
replaced by the rcache access flags.
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
The ofi_rxm provider is dependent upon the underlying hardware for its
implementation of FI_DELIVERY_COMPLETE. Since this can lead to early
completions, we disable the provider to avoid correctness issues.
This is not an issue in the mtl/ofi as it does not require
FI_DELIVERY_COMPLETE.
Signed-off-by: William Zhang <wilzhang@amazon.com>
The btl/ofi does not currently utilize the common ofi include/exclude
list. Added verification code similar to the mtl/ofi that will check if
the info object is in the include or exclude list. If it isn't in the
include list or is in the exclude list, validate_info will return
OPAL_ERROR. The btl/ofi will no longer pass a provider name as a hint
when calling getinfo, instead filtering the provider during
validate_info.
This patch also moves the is_in_list MTL function into common code and
adds additional debugging output to the BTL to match the MTL standard.
Signed-off-by: William Zhang <wilzhang@amazon.com>
EFA incorrectly implements FI_DELIVERY_COMPLETE in earlier libfabric
versions. While FI_DELIVERY_COMPLETE would be advertised by the
provider, completions would return too early by not accounting for
bounce buffers on the receive side. This would cause the BTL
to receive early completions that lead to correctness issues.
This is not an issue in the mtl/ofi as it does not require
FI_DELIVERY_COMPLETE.
Signed-off-by: William Zhang <wilzhang@amazon.com>