tear down
HCOLL barrier may not complete if HCOLL progress is not called periodically.
which is the case in HCOLL teardown progress in the finalize.
(cherry picked from commit 793244d75dd94d1d5e0243bcccf6d04318750f3f)
This commit fixes a bad synchronization detection bug that occurs when
mixing MPI_Win_fence() and MPI_Win_lock(). If no communication has
occurred in the fence epoch it is safe to just clear the all_sync
object (it was set up by fence).
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit fixes a bug that occurs when ranks are either not mapped
evenly or by something other than core.
Fixesopen-mpi/ompi#1599
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
If during the request completion callback we post another request that
completes right away (such a small send or a match for an unexpected
short message) we will try to complete the second request while holding
the lock for the completion of the first. For performance reasons
(mainly to avoid unlocking and locking the request mutex several times)
we have made the request lock recursive.
Fix CID 72362: Explicit null dereferenced (FORWARD_NULL)
From what I can tell the code @ fcoll_static_file_read_all.c:649
should be setting bytes_per_process[i] to 0 not bytes_per_process.
Fix CID 72361: Explicit null dereferenced (FORWARD_NULL)
Modified check to check for blocklen_per_process non-NULL before
trying to free blocklen_per_process[l]. This is sufficient because
free (NULL) is safe. Also cleaned up the initialization of this an a
couple other arrays. They were allocated with malloc() then
initialized to 0. Changed to used calloc().
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Fix CID 72296: Resource leak (RESOURCE_LEAK):
Changed code to goto exit instead of returning to ensure memory is
freed.
Fix CID 712589: Out-of-bounds read (OVERRUN):
In this loop i and j are identical and always less than
iov_count. The CID was triggered because i was incremented if i was <
iov_count. This meant that if the loop did go on the next iteration
would access an invalid index.
Fix CID 741363: Uninitialized scalar variable (UNINIT):
Allocate tmp_len with calloc to insure every index is initialized.
Fix CID 741364: Uninitialized pointer read (UNINIT):
Allocate recv_types with calloc to ensure all indices are always
initialized. Also added a check to not loop and destroy if recv_types
is NULL.
Also added a NULL check on the allocation of decoded iov. This is not
the cause of CID 126784 but should be fixed.
Fix CID 712588: Out-of-bounds read (OVERRUN):
Similar to CID 712589. Should silence the issue.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Before this commit, a same PML tag may be used for distinct
communications for long messages. For example, consider a condition
where rank A calls ```MPI_PUT``` targeting rank B and rank B calls
```MPI_GET``` targeting rank A simultaneously.
A PML tag for the ```MPI_PUT``` is acquired on rank A and is used
for the long-message communication from rank A to rank B.
A PML tag for the ```MPI_GET``` is acquired on rank B and is used
for the long-message communication from rank A to rank B.
These two tags may become a same value because they are managed
independently on each rank. This will cause a data corruption.
This commit separates the tag used in a single RMA communication
call, one for communication from an origin to a target, and one
for communication from a target to an origin. A "base" tag
is acquired using ```get_tag``` function and PML tag is caluculated
from the base tag by ```tag_to_target``` and ```tag_to_origin```
function.
This commit ensures the bml is always enabled whether or not it will
be used. This ensures that any available btls communicate their modex
so that they can be used for one-sided communication.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Added mca parameter to turn progress thread on/off
Add a flag to check if we have btl progress thread.
Added macro for ob1 matching lock.
Update the AUTHORS file.
This commit adds code to detect when procs are unreachable when using
the dynamic add_procs functionality.
Fixes#1501
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Or at least that was the origin of the issue. It turns out
we were freeing the wrong buffer (but as it only happen in the
case of an error we never noticed).
This patch addresses most (if not all) @derbeyn concerns
expressed on #1015. I added checks for the requests allocation
in all functions, ompi_coll_base_free_reqs is called with the
right number of requests, I removed the unnecessary basic_module_comm_t
and use the base_module_comm_t instead, I remove all uses of the
COLL_BASE_BCAST_USE_BLOCKING define, and other minor fixes.
Fix CID 1315298: Resource leak (RESOURCE_LEAK) :
Fix CID 1315300: Resource leak (RESOURCE_LEAK):
Fix CID 1315299: Resource leak (RESOURCE_LEAK):
Fix CID 1315297 (#1 of 1): Resource leak (RESOURCE_LEAK):
Confirmed leaks in error paths. Added the leaked arrays to the
ERR_EXIT macro to ensure they are freed.
Fix CID 1315296 (#1 of 1): Resource leak (RESOURCE_LEAK):
Confirmed leak in error paths. Both the oversub and reqs arrays are
leaked. Free these arrays on error.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Fix CID 1269976 (#1 of 1): Unused value (UNUSED_VALUE):
Fix CID 1269979 (#1 of 1): Unused value (UNUSED_VALUE):
Removed unused variables k_temp1 and k_temp2.
Fix CID 1269981 (#1 of 1): Unused value (UNUSED_VALUE):
Fix CID 1269974 (#1 of 1): Unused value (UNUSED_VALUE):
Removed gotos and use the matched flags to decide whether to return.
Fix CID 715755 (#1 of 1): Dereference null return value (NULL_RETURNS):
This was also a leak. The items on cs->ctl_structures are allocated using OBJ_NEW so they mist be released using OBJ_RELEASE not OBJ_DESTRUCT. Replaced the loop with OPAL_LIST_DESTRUCT().
Fix CID 715776 (#1 of 1): Dereference before null check (REVERSE_INULL):
Rework error path to remove REVERSE_INULL. Also added a free to an error path where it was missing.
Fix CID 1196603 (#1 of 1): Bad bit shift operation (BAD_SHIFT):
Fix CID 1196601 (#1 of 1): Bad bit shift operation (BAD_SHIFT):
Both of these are false positives but it is still worthwhile to fix so they no longer appear. The loop conditional has been updated to use radix_mask_pow instead of radix_mask to quiet these issues.
Fix CID 1269804 (#1 of 1): Argument cannot be negative (NEGATIVE_RETURNS):
In general close (-1) is safe but coverity doesn’t like it. Reworked the error path for open to not try to close (-1).
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Fix CID 715744 (#1 of 1): Logically dead code (DEADCODE):
Fix CID 715745 (#1 of 1): Logically dead code (DEADCODE):
The free of scratch_num in either place is defensive programming. Instead of removing the free the conditional around the free has been removed to quiet the warning.
Fix CID 715753 (#1 of 1): Dereference after null check (FORWARD_NULL):
Fix CID 715778 (#1 of 1): Dereference before null check (REVERSE_INULL):
Fixed the conditional to check for collective_alg != NULL instead of collective_alg->functions != NULL.
Fix CID 715749 (#1 of 4): Explicit null dereferenced (FORWARD_NULL):
Updated code to ensure that none of the parse functions are reached with a non-NULL value.
Fix CID 715746 (#1 of 1): Logically dead code (DEADCODE):
Removed dead code.
Fix CID 715768 (#1 of 1): Resource leak (RESOURCE_LEAK):
Fix CID 715769 (#2 of 2): Resource leak (RESOURCE_LEAK):
Fix CID 715772 (#1 of 1): Resource leak (RESOURCE_LEAK):
Move free calls to before error checks to cleanup leak in error paths.
Fix CID 741334 (#1 of 1): Explicit null dereferenced (FORWARD_NULL):
Added a check to ensure temp is not dereferenced if it is NULL.
Fix CID 1196605 (#1 of 1): Bad bit shift operation (BAD_SHIFT):
Fixed overflow in calculation by replacing int mask with 1ul.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Fix CID 1325868 (#1 of 1): Dereference after null check (FORWARD_NULL):
Fix CID 1325869 (#1-2 of 2): Dereference after null check (FORWARD_NULL):
Here reqs can indeed be NULL. Added a check to
ompi_coll_base_free_reqs to prevent dereferencing NULL pointer.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Fix CID 1324726 (#1 of 1): Free of address-of expression (BAD_FREE):
Indeed, if a lock conflicts with the lock_all we will end up trying to
free an invalid pointer.
Fix CID 1328826 (#1 of 1): Dereference after null check (FORWARD_NULL):
This was intentional but it would be a good idea to check for
module->comm being non_NULL to be safe. Also cleaned out some checks
for NULL before free().
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit rewrites both the mpool and rcache frameworks. Summary of
changes:
- Before this change a significant portion of the rcache
functionality lived in mpool components. This meant that it was
impossible to add a new memory pool to use with rdma networks
(ugni, openib, etc) without duplicating the functionality of an
existing mpool component. All the registration functionality has
been removed from the mpool and placed in the rcache framework.
- All registration cache mpools components (udreg, grdma, gpusm,
rgpusm) have been changed to rcache components. rcaches are
allocated and released in the same way mpool components were.
- It is now valid to pass NULL as the resources argument when
creating an rcache. At this time the gpusm and rgpusm components
support this. All other rcache components require non-NULL
resources.
- A new mpool component has been added: hugepage. This component
supports huge page allocations on linux.
- Memory pools are now allocated using "hints". Each mpool component
is queried with the hints and returns a priority. The current hints
supported are NULL (uses posix_memalign/malloc), page_size=x (huge
page mpool), and mpool=x.
- The sm mpool has been moved to common/sm. This reflects that the sm
mpool is specialized and not meant for any general
allocations. This mpool may be moved back into the mpool framework
if there is any objection.
- The opal_free_list_init arguments have been updated. The unused0
argument is not used to pass in the registration cache module. The
mpool registration flags are now rcache registration flags.
- All components have been updated to make use of the new framework
interfaces.
As this commit makes significant changes to both the mpool and rcache
frameworks both versions have been bumped to 3.0.0.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Fix segfault due to mca_pml_ob1_cuda_need_buffers not handling the case of the
endpoint not being there. Calling mca_bml_get_endpoint() seems to fix the problem.
Fixesopen-mpi/ompi#1402
This commit removes the --with-mpi-thread-multiple option and forces
MPI_THREAD_MULTIPLE support. This cleans up an abstration violation
in opal where OMPI_ENABLE_THREAD_MULTIPLE determines whether the
opal_using_threads is meaningful. To reduce the performance hit on
MPI_THREAD_SINGLE programs an OPAL_UNLIKELY is used for the
check on opal_using_threads in OPAL_THREAD_* macros.
This commit does not clean up the arguments to the various functions
that take whether muti-threading support is enabled. That should be
done at a later time.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
converting an opal_process_name_t means the loss of one bit,
it was decided to restrict the local job id to 15 bits, so the
useful information of an opal_process_name_t can fit in 63 bits.
This commit fixes several bugs identified by @ggouaillardet and MTT:
- Fix SEGV in long send completion caused by missing update to the
request callback data.
- Add an MPI_Barrier to the fence short-cut. This fixes potential
semantic issues where messages may be received before fence is
reached.
- Ensure fragments are flushed when using request-based RMA. This
allows MPI_Test/MPI_Wait/etc to work as expected.
- Restore the tag space back to 16-bits. It was intended that the
space be expanded to 32-bits but the required change to the
fragment headers was not committed. The tag space may be expanded
in a later commit.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit fixes several bugs identified by a new multi-threaded RMA
benchmarking suite. The following bugs have been identified and fixed:
- The code that signaled the actual start of an access epoch changed
the eager_send_active flag on a synchronization object without
holding the object's lock. This could cause another thread waiting
on eager sends to block indefinitely because the entirety of
ompi_osc_pt2pt_sync_expected could exectute between the check of
eager_send_active and the conditon wait of
ompi_osc_pt2pt_sync_wait.
- The bookkeeping of fragments could get screwed up when performing
long put/accumulate operations from different threads. This was
caused by the fragment flush code at the end of both put and
accumulate. This code was put in place to avoid sending a large
number of unexpected messages to a peer. To fix the bookkeeping
issue we now 1) wait for eager sends to be active before stating
any large isend's, and 2) keep track of the number of large isends
associated with a fragment. If the number of large isends reaches
32 the active fragment is flushed.
- Use atomics to update the large receive/send tag counters. This
prevents duplicate tags from being used. The tag space has also
been updated to use the entire 16-bits of the tag space.
These changes should also fixopen-mpi/ompi#1299.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds code to handle large unaligned gets. There are two
possible code paths for these transactions:
1) The remote region and local region have the same alignment. In
this case the get will be broken down into at most three get
transactions: 1 transaction to get the unaligned start of the region
(buffered), 1 transaction to get the aligned portion of the region,
and 1 transaction to get the end of the region.
2) The remote and local regions do not have the same alignment. This
should be an uncommon case and is not optimized. In this case a
buffer is allocated and registered locally to hold the aligned data
from the remote region. There may be cases where this fails (low
memory, can't register memory). Those conditions are unlikely and
will be handled later.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
If the state of the request is not set to OMPI_REQUEST_ACTIVE
then MPI_Test would immediately signal such request completed
while hcoll may still be working on it.
Signed-off-by: Joshua Ladd <jladd.mlnx@gmail.com>
If atomics are not globally visible (cpu and nic atomics do not mix)
then a btl endpoint must be used to access local ranks. To avoid
issues that are caused by having the same region registered with
multiple handles osc/rdma was updated to always use the handle for
rank 0. There was a bug in the update that caused osc/rdma to continue
using the local endpoint for accessing the state even though the
pointer/handle are not valid for that endpoint. This commit fixes the
bug.
Fixesopen-mpi/ompi#1241.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Optimizing put aggregation in the presence of threads will require a
redesign of the code. For now just ensure that put aggregation is
turned off when MPI_THREAD_MULTIPLE is enabled.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
A bus error occurs in sm OSC under the following conditions.
- sparc64 or any other architectures which need strict alignment.
- `MPI_WIN_POST` or `MPI_WIN_START` is called for a window created
by sm OSC.
- The communicator size is odd and greater than 3.
The lines 283-285 in current `ompi/mca/osc/sm/osc_sm_component.c` has
the following code.
```c
module->global_state = (ompi_osc_sm_global_state_t *) (module->segment_base);
module->node_states = (ompi_osc_sm_node_state_t *) (module->global_state + 1);
module->posts[0] = (uint64_t *) (module->node_states + comm_size);
```
The size of `ompi_osc_sm_node_state_t` is multiples of 4 but not
multiples of 8. So if `comm_size` is odd, `module->posts[0]` does
not aligned to 8. This causes a bus error when accessing
`module->posts[i][j]`.
This patch fixes the alignment of `module->posts[0]` by setting
`module->posts[0]` first.
Update the configure logic for the new pmix120 component
ckpt
Get the pmix120 component to work - still not really registering or handling notifications, but infrastructure now operates
Cleanup some of the symbol scopes, and provide a more comprehensive rename.h file. Will pretty it up later - let's see how this works
Cleanup the rename files to use the pretty macros
When mtl-portals4 is configured for logical mapping, coll-portals4
must disqualify because it does not yet support logical mapping.
coll-portals4 looks for the endpoint pid to be zero which tells it
that mtl-portals4 is configured for logical mapping. This commit
initializes the endpoint nid/pid to zero for logical mapping.
NOTE: Building with external pmix *requires* that you also build with external libevent and hwloc libraries. Detect this at configure and error out with large message if this requirement is violated.
Closes#1204 (replaces it)
Fixes#1064
A previous commit updated the one-sided code to register the state
region only once. This created an issue when using the scratch lock
with fetching atomics. In this case on any rank that isn't local rank
0 the module->state_handle is NULL. This commit fixes the issue by
removing the scratch lock and using a fragment pointer instead.
Fixesopen-mpi/ompi#1290
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
accelerate message queue lookups if the lookups have
the proper tag&mask layout. OpenMPI should follow
PSM2's preferred tag&mask spec, so that PSM2 can provide
a performance benefit.
Remove send of the extra message. This bug hase triggered on
MPICH/coll/nbicbarrier test. In this test a series of communicators
are created.
This extre-message was reseived after original communicator was destroyed
and queued into non_existing_communicator_pending. When new completely
unrelated communicator with the same id as original was created this message
was pushed into the frags_cant_match queue and caused seq numbers skew and hang.
`MPI_WIN_TEST` must update the `flag` parameter to 0 when not all
origin processes called `MPI_WIN_COMPLETE`. But sm OSC doesn't.
If the caller initialize the `flag` argument to a non-0 value,
the caller will receive the non-0 `flag` value.
This commit adds a call to ompi_request_wait_completion for buffered
sends. Without this line it is possible to get into a state where the
data is never sent.
Fixesopen-mpi/ompi#1185
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
some of the collective modules. Added a new function
opan_datatype_span, to compute the memory span of
count number of datatype, excluding the gaps in the
beginning and at the end. If a memory allocation is
made using the returned value, the gap (also returned)
should be removed from the allocated pointer.
This commit changes the way ompi_proc_t's are retained/released by
ompi_group_t's. Before this change ompi_proc_t's were retained once
for the group and then once for each retain of a group. This method
adds unnecessary overhead (need to traverse the group list each time
the group is retained) and causes problems when using an async
add_procs.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit removes two pieces of unneeded code from gather. First
it removes destroy_tree() calls from linear_top(), because the
linear algorithm does not create a tree, so there is no need to
destroy it. Second it removes unpack_bytes from the gather request
because it was calculated but never used.
There were two bugs in osc/rdma when using threads:
- Deadlock is ompi_osc_rdma_start_atomic. This occurs because
ompi_osc_rdma_frag_alloc is called with the module lock. To fix the
issue the module lock is now recursive. In the future I will add a
new lock to protect just the current rdma fragment.
- Do not drop the lock in ompi_osc_rdma_frag_alloc when calling
ompi_osc_rdma_frag_complete. Not only is it not needed but dropping
the lock at this point can cause a competing thread to mess up the
state.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
We should invoke OBJ_CONTRUCT/OBJ_DESTRUCT only on regular requests
(which are embedded inside UCX requests) and for the completed request.
Persistent requests are already constructed/destructed by the free list.
This fixes an assertion in ompi_request_destruct.
During component finalize, mtl-portals4 would blindly release
resources without testing if the handle was valid. This was OK,
but resource allocation is now delayed until add_procs(). If
mtl-portals4 is deselected, it will be finalized without
add_procs() ever being called. This commit ensures that invalid
handles are not released.
Writing to the pml_monitoring_flush variable will set the filename of
the output file.
Stopping a session for the pml_monitoring_flush will force the
generation of the nobitoring output file (as long as the filename
is not NULL).
To reset the monitoring, une has to bind the pml_monitoring_flush to a
session.
using performance variables "pml_monitoring_messages_count" and
"pml_monitoring_messages_size"
Per Brice suggestion make all data count and message length be
uint64_t.
counting or not the collective traffic as a separate entity. The need
for such a PML is simply because the PMPI interface doesn't allow us to
identify the collective generated traffic.
This commit fixes a bug that can occur on Cray Gemini networks. If
multiple registrations are used for the local state then we looks the
atomicity guarantees. To avoid issues like this use only a single
registration handle for all local state on a node.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit fixes the following:
- CIDs 1328491, 1328492: Dead code caused by typos in a prior
commit.
- Fix the calculation of dynamic memory regions. This was causes
incorrect RMA range errors when accessing the last partial page of
an attachment.
- Fix a SEGV when using dynamic memory windows with local state (all
processes on the same node).
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit fixes a bug that occurs when a post message comes in when
sending complete messages or while waiting for all outgoing messages
to flush. In that case the post message might get incorrecly
associated with the ending sync object.
References open-mpi/ompi#1012
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit add protection to the group, ob1, and bml endpoint lookup
code. For ob1 and the bml a lock has been added. For performance
reasons the lock is only held if a bml or ob1 endpoint does not
exist. ompi_group_dense_lookup no uses opal_atomic_cmpset to ensure
the proc is only retained by the thread that actually updates the
group.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit fixes several bugs in the osc/rdma component:
- Complete aggregated requests immediately. Completion of RMA
requests indicates local completion anyway. This fixes a hang in
the c_reqops test.
- Correctly mark Rget_accumulate requests.
- Set the local base flag correctly on the local peer.
- Clear or set the no locks flag on the window if the value is
changed by MPI_Win_set_info.
- Actually update the target when using MPI_OP_REPLACE.
Fixesopen-mpi/ompi#1010
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit changes the priority of mtl components to be relative to
pml/ob1 and updates the mtl interface to expose this priority. cm now
sets its own priority based on the priority of the selected mtl
component.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
The mca_base_select function uses returned priorities to select the
best component/module. This priority may be of use to the caller so
pass that information back in an optional argument. If the priority is
not needed pass NULL.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit removes code that checks the ob1 priority vs the previous
priority. The previous priority is meaningless here and may only cause
ob1 to disable itself when it shouldn't.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This patch removes a priority check that disables cm if the previous
pml had higher priority. The check was incorrect as coded and is
unnecessary as we finalize all but one pml anyway.
Fixesopen-mpi/ompi#1035
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Once a FIN control message is appended to the pending list,
the ob1 PML attempts to send the FIN again in the `mca_pml_ob1_process_pending_packets` function.
But if the PML failed to sent the FIN again, the `mca_pml_ob1_send_fin`
function creates a new `mca_pml_ob1_pckt_pending_t` object and the
old object is not retured to the free list.
This commit fixes an issue identified by @rolfv. The local peer was
not being correctly initialized when running with a single process on
a node.
This fixesopen-mpi/ompi#1010
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds implementations of gather and igather using
Portals4 triggered operations. The default algorithm is linear,
but binomial can be selected using an MCA parameter -
coll_portals4_use_binomial_gather_algorithm.
This patch fixes the issues identified by @ggouaillardet in the IBM tests (collectives and topologies). It also improves the memory usage of OMPI, as a communicator without collective communications will never allocate the array of requests needed to coordinate the basic collective algorithms. This ticket replaced #790.
Fix code paths that didn't convert the MPI datatype to the
corresponding Portals4 datatype.
Thanks to Nicolas Chevalier (@shawone) for finding this bug and
submitting a patch.
in the base).
Correctly deal with persistent requests (they must be always freed when
they are stored in the request array associated with the communicator).
Always use MPI_STATUS_IGNORE for single request waiting functions.
The Portals4 get_peer family incorrectly cast the ompi_proc_t to
ptl_process_t and returned that as the peer. The ptl_process_t is
actually found in the endpoint array. This commit fixes the
Portals4 get_peer family to return the dereferenced endpoint
pointer.
FCA barrier may not complete if FCA progress is not called periodically.
PMI/PMI2 API that can be used in rte barrier has no provision for calling
external progress function.
So it is possible that during finalize some ranks will be stuck
in fca barrier while others are in PMI barrier.
Fixes CID 72320: Explicit NULL dereferenced
On error it is possible that the blocklen_per_process array is
NULL. Change the NULL check before the free to check for non-NULL on
the array not the array element. Also clean up allocation of this
array to use calloc instead of malloc + setting each element to NULL.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Fixes CIDs 72300, 72344, 1196764-1196768, 72300: Resource leaks
Mulitple allocated arrays are going out of scope at the end of
mca_fcoll_two_phase_file_write_all. Free these arrays. Also removed
the extraneous NULL checks since free (NULL) is safe in C.
Change returns to goto exit where the allocated resources are freed.
Fixes CIDs 72285-72292, 72297, 72298: Resource leaks
Change all appropriate return statements to goto exit to ensure that
all resources are freed. Also removed the NULL checks since free
(NULL) is safe in C.
Fixes CIDs 72295, 72296: Resource leaks
Moved free of requests and recv_types to after exit label. This will
ensure these are freed on error.
Also added a loop and statement to free send_buf which is going out of
scope at the end of the function.
Fixes CIDs 72336-72240, 735197, 735198: Resource leaks
Moved the exit label before to before the resources are released and
changed all appropriate return statements to goto exit. Also removed
extraneous NULL checks because free (NULL) is safe in C.
Fixes CIDs 72341, 72343, 1196805-1196809: Resource leaks
Free all resources after exit label and change return statements to
goto exit to ensure all resources are freed on error.
Fixes CID 1269973: Unused value
Check return code of ompi_request_wait_all. If it fails jump to the
exit.
Fixes CID 714119: Dereference before NULL check
Wrong value checked in conditional.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Fix CID 1315271: Constant expression result
The intent of this conditional is to not produce a peruse event for
probe or mprobe requests. Coverity is correct that the expression is
always true. Changed the || to && to fix. Also moved the conditional
within an OMPI_WANT_PERUSE to ensure the conditional is not evaluated
if peruse is disabled.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Fixed CID 1269712, 1269709, 1269706, 1269703, 1269694: Logically dead code
Remove extra NULL check as OMPI_OSC_PT2PT_REQUEST_ALLOC can never set the
request to NULL.
Fixes CID 1269668: Unchecked return value
False positive. Add (void) to indicate we do not care about the return code
from opal_hash_table_get_uint32.
Fixes CID 1324726: Free of address-of expression
Do not free lock if it was not allocated.
Fixes CID 1269658: Free of address-of expression
Never will happen but because op is always a built-in op there is no
reason to retain/release it anyway.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
In the default mode of operation, the Portals4 components support
dynamic add_procs().
The Portals4 components have two alternate modes (flow control and
logical-to-physical) that require knowledge of all procs at startup.
In these modes, mtl-portals4 sets the MCA_MTL_BASE_FLAG_REQUIRE_WORLD
flag and btl-portals4 sets the MCA_BTL_FLAGS_SINGLE_ADD_PROCS flag
to tell the PML that we need all the procs in one add_procs() call.
This commit adds support to the pml, mtl, and btl frameworks for
components to indicate at runtime that they do not support the new
dynamic add_procs behavior. At the high end the lack of dynamic
add_procs support is signalled by the pml using the new pml_flags
member to the pml module structure. If the
MCA_PML_BASE_FLAG_REQUIRE_WORLD flag is set MPI_Init will generate the
ompi_proc_t array passed to add_proc from ompi_proc_world () instead
of ompi_proc_get_allocated ().
Both cm and ob1 have been updated to detect if the underlying mtl and
btl components support dynamic add_procs.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds two new functions:
- ompi_proc_get_allocated - Returns all procs in the current job that
have already been allocated. This is used in init/finalize to
determine which procs to pass to add_procs/del_procs.
- ompi_proc_world_size - returns the number of processes in
MPI_COMM_WORLD. This may be removed in favor of callers just
looking at ompi_process_info.
The behavior of ompi_proc_world has been restored to return
ompi_proc_t's for all processes in the current job. The use of this
function is discouraged.
Code that was using ompi_proc_world() has been updated to make use of
the new functions to avoid the memory overhead of ompi_comm_world ().
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Fix CID 1196812: Resource Leak
dsts array was leaked on error.
Fix CID 710565: Copy-paste error
The line in question (nbc:513) is indeed a copy-paste error.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
ROMIO configure looks for lstat in wrong header
The ROMIO configure script checks for a declaration of lstat in
unistd.h, but, at least on the Linux machines I checked, lstat is in
sys/stat.h. (The detection failure led to a linker error when building
ROMIO as part of OpenMPI on one of my admittedly strangely configured
machines, somehow.) It appears from the man page that either location
is possible, so check both.
(cherry picked from mpich/mpich@7b8bd055df)
Signed-off-by: Rob Latham <robl@mcs.anl.gov>
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
The add_procs change made some assumptions in the bml/r2 add_procs
wrong. This lead to del_procs never being called. I removed the logic
that checks the ompi_proc_t reference count and removed an unnecessary
allocation. The allocation only makes sense if we pass more than a
single proc at a time to the btl del_procs.
This commit also ensures that the btl del_procs is called if the
endpoint is in the btl_rdma array but not the btl_send array.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
The osc/sm component was using a simple counter to determine if all
expected posts had arrived to start a PSCW access epoch. This is
incorrect as a post may arrive from a peer that isn't part of the
current start group. There are many ways this could have been fixed.
This commit adds an n^2 bitmap. When a process posts it sets a bit in
the bitmap associated with the access rank to indicate the post is
complete. The access rank checks for and clears the bits associated
with all the processes in the start group.
The bitmap requires comm_size ^ 2 bits of space. This should be
managable as most nodes have relatively small numbers of processes. If
this changes another algorigthm can be implemented.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Fix CID 1324733: Null pointer dereferences (FORWARD_NULL)
Fix CID 1324734: Null pointer dereferences (FORWARD_NULL)
Fix CID 1324735: Null pointer dereferences (FORWARD_NULL)
Fix CID 1324736: Null pointer dereferences (FORWARD_NULL)
Fix CID 1324737: Null pointer dereferences (FORWARD_NULL)
Fix CID 1324751: Memory - illegal accesses (USE_AFTER_FREE)
Fix CID 1324750: (USE_AFTER_FREE)
Fix CID 1324749: Memory - corruptions (USE_AFTER_FREE)
Fix CID 1324748: Memory - illegal accesses (USE_AFTER_FREE)
Fix CID 1324747: (USE_AFTER_FREE)
Fix CID 1324746: Memory - corruptions (USE_AFTER_FREE)
Add missing return on an error path.
Fix CID 1324745: Code maintainability issues (UNUSED_VALUE)
Ignore return code from barrier. It was not being used anyway.
Fix CID 1324738: Null pointer dereferences (FORWARD_NULL)
Fix CID 1324741: Null pointer dereferences (REVERSE_INULL)
module->selected_btl can not be NULL in osc/rdma during normal
operation. Removed the unnecessary NULL check.
Fix CID 1324752: Memory - illegal accesses (USE_AFTER_FREE)
Move ompi_osc_pt2pt_module_lock_remove to before the lock is freed.
Fix CID 1324744: Uninitialized variables (UNINIT)
Fix CID 1324743: Uninitialized variables (UNINIT)
This array is not used unitialized but there is no reason not to use
calloc here to silence the warning.
The following CID is a false positive: 1324742. I will mark it such in
coverity.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds support for performing one-sided operations over
supported hardware (currently Infiniband and Cray Gemini/Aries). This
component is still undergoing active development.
Current features:
- Use network atomic operations (fadd, cswap) for implementing
locking and PSCW synchronization.
- Aggregate small contiguous puts.
- Reduced memory footprint by storing window data (pointer, keys,
etc) at the lowest rank on each node. The data is fetched as each
process needs to communicate with a new peer. This is a trade-off
between the performance of the first operation on a peer and the
memory utilization of a window.
TODO:
- Add support for the accumulate_ops info key. If it is known that
the same op or same op/no op is used it may be possible to use
hardware atomics for fetch-and-op and compare-and-swap.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit updates osc/pt2pt to allocate peer object as they are
needed rather than all at once. Additionally, to help improve the
memory footprint a new synchronization structure has been added.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>