The osc/sm component was using a simple counter to determine if all
expected posts had arrived to start a PSCW access epoch. This is
incorrect as a post may arrive from a peer that isn't part of the
current start group. There are many ways this could have been fixed.
This commit adds an n^2 bitmap. When a process posts it sets a bit in
the bitmap associated with the access rank to indicate the post is
complete. The access rank checks for and clears the bits associated
with all the processes in the start group.
The bitmap requires comm_size ^ 2 bits of space. This should be
managable as most nodes have relatively small numbers of processes. If
this changes another algorigthm can be implemented.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Fix CID 1324733: Null pointer dereferences (FORWARD_NULL)
Fix CID 1324734: Null pointer dereferences (FORWARD_NULL)
Fix CID 1324735: Null pointer dereferences (FORWARD_NULL)
Fix CID 1324736: Null pointer dereferences (FORWARD_NULL)
Fix CID 1324737: Null pointer dereferences (FORWARD_NULL)
Fix CID 1324751: Memory - illegal accesses (USE_AFTER_FREE)
Fix CID 1324750: (USE_AFTER_FREE)
Fix CID 1324749: Memory - corruptions (USE_AFTER_FREE)
Fix CID 1324748: Memory - illegal accesses (USE_AFTER_FREE)
Fix CID 1324747: (USE_AFTER_FREE)
Fix CID 1324746: Memory - corruptions (USE_AFTER_FREE)
Add missing return on an error path.
Fix CID 1324745: Code maintainability issues (UNUSED_VALUE)
Ignore return code from barrier. It was not being used anyway.
Fix CID 1324738: Null pointer dereferences (FORWARD_NULL)
Fix CID 1324741: Null pointer dereferences (REVERSE_INULL)
module->selected_btl can not be NULL in osc/rdma during normal
operation. Removed the unnecessary NULL check.
Fix CID 1324752: Memory - illegal accesses (USE_AFTER_FREE)
Move ompi_osc_pt2pt_module_lock_remove to before the lock is freed.
Fix CID 1324744: Uninitialized variables (UNINIT)
Fix CID 1324743: Uninitialized variables (UNINIT)
This array is not used unitialized but there is no reason not to use
calloc here to silence the warning.
The following CID is a false positive: 1324742. I will mark it such in
coverity.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds man pages for the MPI_Win_allocate and MPI_Win_allocated_shared
MPI-3 functions. The man page for MPI_Win_create has also been updated to
indicate support for the same_size and same_disp_unit info keys
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
In the v1.10 opal_progress_thread emulation, ensure to create a
blocking libevent timer far in the future. Without that,
opal_event_loop() will return immediately (and therefore the progress
thread spins hard, stealing CPU cycles).
We try to keep the source code the same between master and v1.10. So
put the #if's back for OPAL_HAVE_HWLOC (and just hard-code it to 1 on
master) so that this code is also compilable in v1.10.
ompi has new mpi_add_procs_cutoff argument that can control
creation of ompi_proc_t but We should be confident that all
ompi_proc_t object exists during oshmem_group_all creation.
Probably it could be done in more flexible way later.
Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com>
Most functionality of oshmem_proc duplicates ompi_proc. In addition
to that, Current logic does not allow to do oshmem initialization
w/o ompi startup.
So this refactoring allows to avoid code duplication, decrease used
memory and make oshmem support easier.
Now oshmem_proc is transparent ompi_proc structure, that can be
extended by oshmem specific data.
Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com>