* Negative values are parameter errors for neighborhood collectives
- Add checks to the mpi/c interface `MPI_PARAM_CHECK`
* Fix a success check for neighbor_alltoallw with dist_graph
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
This commit fixes a number of threading issues discovered in
osc/pt2pt. This includes:
- Lock the synchronization object not the module in osc_pt2pt_start.
This fixes a race between the start function and processing post
messages.
- Always lock before calling cond_broadcast. Fixes a race between
the waiting thread and signaling thread.
- Make all atomically updated values volatile.
- Make the module lock recursive to protect against some deadlock
conditions. Will roll this back once the locks have been
re-designed.
- Mark incoming complete *after* completing an accumulate not
before. This was causing an incorrect answer under certain
conditions.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Fixes a wrong answer from MPI_Ireduce when the red_sched_chain()
path was taken (which only happens for np<=4 and mesgsize>=64k).
The way libnbc treats MPI_IN_PLACE is to set sbuf == rbuf, and
whether an algorithm will work cleanly or not after that depends on the
details.
In this case the last steps of the algorithm amounted to
(right neighbor is sending us reduction results from ranks 1..n-1)
recv into rbuf from right neighbor
add the contribution from our sbuf into rbuf
this would be fine in general, but if sbuf==rbuf, that recv overwrites
the sbuf. I changed it to recv into a tmpbuf if MPI_IN_PLACE was used.
Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
Using MPI_MINLOC or MPI_MAXLOC with the following data types
leads to data corruption:
* MPI_DOUBLE_INT
* MPI_LONG_INT
* MPI_SHORT_INT
* MPI_LONG_DOUBLE_INT
Detect this print a error message and abort.
This workaround should be removed once the following issue is resolved:
* https://github.com/open-mpi/ompi/issues/1666
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
MPI_Allgatherv with MPI_IN_PLACE reads data from wrong location.
They were locating the MPI_IN_PLACE send buffer as
```c
send_buf = (char*)rbuf;
for (i = 0; i < rank; ++i) {
send_buf += ((ptrdiff_t)rcounts[i] * extent);
}
```
when it should be
```c
send_buf = (char*)rbuf;
send_buf += ((ptrdiff_t)disps[rank] * extent);
```
because disps[] specifies where things are in the v-style buffers.
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
when a file is opened a second time for shared file pointer operations,
avoid setting the create and exclusive flag.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
it looks like disabling the lazy_open flag for sharedfp components
revealead a bug that lead to a crash in file_close in some tests. Make
sure the SHAREDFP_IS_SET flag is correctly set (and not overwritten again),
and we use that to avoid a double-free of the communicator.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
Revert the logic of io_ompio_sharedfp_lazy_open. The user now has to explicitely
disable shared fp in order for the structures not to be allocated.
Otherwise, resetting the shared fp e.g. in case the file was opened
in append mode will not work correctly, the code could deadlock.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
Fixes a bug reported on the mailing list. ompio did only reposition the individual
file pointer when the file was opened in append mode. Set the shared file
pointer also to point to the end of the file, similarly to the individual
file pointer.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
revert bits of open-mpi/ompi@cf534d0c95
we cannot del_procs here since the pml framework has already been closed
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
There are only five places in the non-daemon code paths where opal_hwloc_topology is currently referenced:
* shared memory BTLs (sm, smcuda). I have added a code path to those components that uses the location string
instead of the topology itself, if available, thus avoiding instantiating the topology
* openib BTL. This uses the distance matrix. At present, I haven't developed a method
for replacing that reference. Thus, this component will instantiate the topology
* usnic BTL. Uses the distance matrix.
* treematch TOPO component. Does some complex tree-based algorithm, so it will instantiate
the topology
* ess base functions. If a process is direct launched and not bound at launch, this
code attempts to bind it. Thus, procs in this scenario will instantiate the
topology
Note that instantiating the topology on complex chips such as KNL can consume
megabytes of memory.
Fix pernode binding policy
Properly handle the unbound case
Correct pointer usage
Do not free static error messages!
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
MPI_T_pvar_get_index was returning an incorrect index. The index
was never set correctly while registering the performance variables.
Additionally fix a missing case in the mca_base_var_type_t to MPI
datatype conversion. This type is currently used for control variables
registered by mxm, fca and hcoll components.
Signed-off-by: Nysal Jan K.A <jnysal@in.ibm.com>