Adds the new API hcoll_context_free that resolves the issues
observed with the ctx cache and group_destroy_notify.
Signed-off-by: Valentin Petrov <valentinp@mellanox.com>
protect the mca_coll_libnbc_component.active_requests list with
the new mca_coll_libnbc_component.lock mutex.
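
A minimal sketch of the locking pattern, assuming a plain pthread mutex and
a singly linked list. The names `sketch_component_t`, `sketch_request_t`,
`sketch_add_request` and `sketch_remove_request` are hypothetical stand-ins
and are not the libnbc types or functions:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not libnbc types. */
typedef struct sketch_request {
    struct sketch_request *next;
    int id;
} sketch_request_t;

typedef struct {
    sketch_request_t *head;   /* plays the role of active_requests */
    size_t length;
    pthread_mutex_t lock;     /* plays the role of the component lock */
} sketch_component_t;

/* Push a request onto the list while holding the lock, so concurrent
 * callers cannot corrupt the list. */
void sketch_add_request(sketch_component_t *c, sketch_request_t *req)
{
    assert(c != NULL && req != NULL);
    pthread_mutex_lock(&c->lock);
    req->next = c->head;
    c->head = req;
    c->length++;
    pthread_mutex_unlock(&c->lock);
}

/* Unlink a request while holding the lock.
 * Returns 1 if the request was on the list, 0 otherwise. */
int sketch_remove_request(sketch_component_t *c, sketch_request_t *req)
{
    int found = 0;
    assert(c != NULL && req != NULL);
    pthread_mutex_lock(&c->lock);
    for (sketch_request_t **p = &c->head; *p != NULL; p = &(*p)->next) {
        if (*p == req) {
            *p = req->next;
            req->next = NULL;
            c->length--;
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&c->lock);
    return found;
}
```

Every access to the list takes the same lock, so a thread walking the list
never sees it half-updated by another thread.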
Thanks Jie Hu for the report
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
- instead of coll_base_comm_get_reqs(2) for irecv/isend, use only
one request allocated on the stack and do an irecv/send
- instead of ompi_request_wait_all(2), simply use ompi_request_wait
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
* If (legal) non-uniform data type signatures are used in ibcast
then the chosen algorithm may fail on the request, and worst case
it could produce wrong answers.
* Add an MCA parameter that, by default, protects the user from this
scenario. If the user really wants to use it then they have to
'opt-in' by setting the following parameter to false:
- `-mca coll_libnbc_ibcast_skip_dt_decision f`
* Once the following Issues are resolved then this parameter can
be removed.
- https://github.com/open-mpi/ompi/issues/2256
- https://github.com/open-mpi/ompi/issues/1763
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
Adds mapping of the MPI Fortran pair types (2INTEGER, 2REAL, 2DBLPREC)
to the corresponding hcoll dtypes.
Signed-off-by: Valentin Petrov <valentinp@mellanox.com>
* If an error is detected internal to libnbc (e.g., PML truncation error)
this patch makes sure that the request is completed and the `MPI_ERROR`
field is set appropriately.
* Make an attempt to clean up outstanding requests before returning.
- This is a "best attempt" since not all PMLs support canceling requests.
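
A minimal sketch of the error path described above, assuming a request that
owns a set of outstanding sub-requests. The types `sketch_request_t` and
`sketch_subreq_t`, the `cancellable` flag and `sketch_request_fail` are
hypothetical and only illustrate the idea; they are not the libnbc or PML
interfaces:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not libnbc types. */
typedef struct {
    int active;       /* sub-request is still outstanding */
    int cancellable;  /* assumption: whether the transport can cancel it */
} sketch_subreq_t;

typedef struct {
    int complete;
    int error;              /* plays the role of the MPI_ERROR status field */
    sketch_subreq_t *subs;
    size_t nsubs;
} sketch_request_t;

/* Cancel what can be cancelled, record err on the request and mark it
 * complete. Returns the number of sub-requests left outstanding. */
size_t sketch_request_fail(sketch_request_t *req, int err)
{
    size_t left = 0;
    assert(req != NULL);
    for (size_t i = 0; i < req->nsubs; i++) {
        sketch_subreq_t *s = &req->subs[i];
        if (!s->active) {
            continue;
        }
        if (s->cancellable) {
            s->active = 0;   /* best attempt: cancel succeeded */
        } else {
            left++;          /* cannot cancel, leave it outstanding */
        }
    }
    req->error = err;
    req->complete = 1;
    return left;
}
```

The request is completed with the error recorded even when some
sub-requests cannot be cancelled, so a caller waiting on it does not hang.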
In order to optimize for MPI_IN_PLACE, data is sent from the receive buffer.
Consequently, it should be sent with the receive type and count.
Thanks Josh Hursey for the report and test case
Refs open-mpi/ompi#2256
if sendbuf is equal to recvbuf, that should not be interpreted
as equivalent to MPI_IN_PLACE on the non root rank(s)
Thanks Valentin Petrov for the report
predefined datatypes such as MPI_LONG_DOUBLE_INT are not really contiguous,
so use span as returned by opal_datatype_span() instead of type extent,
otherwise data might be written above allocated memory.
Thanks Valentin Petrov for the report
Clang 5.1 on my mac was a sad panda compiling a couple
of files, complaining about uninitialized stack variables.
This commit makes clang a happier panda (or at least not so sad).
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
With the current implementation, it is faster to use a blocking
send than the non-blocking version. Switch the exchange function
used in the barrier to use the blocking version combined with
the non-blocking version of the receive.
This is similar to open-mpi/ompi@223d75595d