We've seen this a few times (e.g.,
http://www.open-mpi.org/community/lists/users/2015/06/27057.php
reported via @siegmargross). I'm not entirely sure why it happens --
the best I can come up with is a poorly-synchronized network
filesystem and/or a bug in "make". For example: this code hasn't
changed in forever, and it only happens to users *sometimes*.
Regardless, avoid the error altogether by removing the file before
making the sym link (it should be a sym link anyway -- if there's
something there, it should be safe to remove it before we re-create
the sym link that should be there in the first place).
(cherry picked from commit 0edd265ea045e649c9489e3cb8fdb657800d95c3)
The Portals4 MTL allocates two Portals IDs requesting specific
well-known IDs and assumes that those IDs are allocated. If those IDs
are in use, PtlPTAlloc() will allocate a different ID. This commit
verifies that the requested IDs were allocated.
CID 71734 Self assignment (NO_EFFECT)
This code has no effect. The original author of the offending code
does not remember why the self-assignment is there. Fortran
MPI_Win_get_attr tests are working with or without it so remove the
code.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds support for MPI_Aint_add and MPI_Aint_diff. These
functions are implemented as macros in C (explicitly allowed by
MPI-3.1). The fortran implementations are a similar mess to the
MPI_Wtime implementations.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
CID 1269683 Unchecked return value (CHECKED_RETURN)
CID 1269684 Unchecked return value (CHECKED_RETURN)
Use ompi_comm_rank instead of MPI_Comm_rank here. There is no reason to
be using the MPI interface over the internal interface. This should clear
up these issues.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
CID 1047284 Uninitialized scalar variable (UNINIT)
CID 1047285 Uninitialized scalar variable (UNINIT)
CID 1047286 Uninitialized scalar variable (UNINIT)
If a performance variable session has no handles we should be returning MPI_SUCCESS
for MPI_T_pvar_start, MPI_T_pvar_stop, and MPI_T_pvar_reset. The code was returning
an unitialized value. This commit also updates the error code to return the proper
error on failure.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
CID 1295340 Unchecked return value (CHECKED_RETURN)
Check the return code of mca_base_framework_open. If the call fails for some reason
the component array will not be properly defined. This will cause issues in
mca_topo_base_find_available.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
CID 70630 Dereference before null check
Cleaned up useless goto statements and deleted NULL check. If
mca_base_select returns success than best_module and best_component
will always be non-NULL.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Talked to @ggouaillardet about this code. It was not intended to be committed to
master. Removing to fix coverity issue.
CID 1270134 Unchecked return value
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit also adds protection against negative error codes in ompi
error code functions. There is one outstanding issue. There is a
negative MPI error code defined in mpi.h. This will need to be fixed
separetely.
This commit fixes coverity IDs 1271533 and 1270156.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
The sargs array and its elements were malloced but not freed. Note
that strings passed to NewStringUTF are copied into Java's heap and it
is the callers responsibility to free the original string.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
When activating short receive blocks on the overflow list, remove
the PTL_ME_EVENT_LINK_DISABLE flag so the event gets generated.
Without PTL_EVENT_LINK, the block status can't reach the activated
state.
Replace #ifdef with #if for Open MPI configure booleans, because
Open MPI configure booleans are always defined and the value must
be checked.
This commit fixes some issues with the cuda support parameters. There
were a couple of duplicate registrations and an incorrect synonym (one
variable was made a synonym of mpi_preconnect_mpi).
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
In days past, some implementations of Portals4 could not cover all
of memory with a single Memory Descriptor so multiple large
overlapping Memory Descriptors were created. Because none of the
current implementations have this limitation (and no future
implementations should either), this commit removes the overlapping
Memory Descriptors code.