79006f4e5a Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com> | ||
---|---|---|
base.h | ||
coll_base_allgather.c | ||
coll_base_allgatherv.c | ||
coll_base_allreduce.c | ||
coll_base_alltoall.c | ||
coll_base_alltoallv.c | ||
coll_base_barrier.c | ||
coll_base_bcast.c | ||
coll_base_comm_select.c | ||
coll_base_comm_unselect.c | ||
coll_base_exscan.c | ||
coll_base_find_available.c | ||
coll_base_frame.c | ||
coll_base_functions.h | ||
coll_base_gather.c | ||
coll_base_reduce_scatter_block.c | ||
coll_base_reduce_scatter.c | ||
coll_base_reduce.c | ||
coll_base_scan.c | ||
coll_base_scatter.c | ||
coll_base_topo.c | ||
coll_base_topo.h | ||
coll_base_util.c | ||
coll_base_util.h | ||
coll_tags.h | ||
help-mca-coll-base.txt | ||
Makefile.am | ||
owner.txt | ||
README.memory_management |
/* This comment applies to all collectives (including the basic
 * module) where we allocate a temporary buffer.  For the next few
 * lines of code, it's tremendously complicated how we decided that
 * this was the Right Thing to do.  Sit back and enjoy.  And prepare
 * to have your mind warped. :-)
 *
 * Recall some definitions (I always get these backwards, so I'm
 * going to put them here):
 *
 * extent: the length from the lower bound to the upper bound -- may
 * be considerably larger than the buffer required to hold the data
 * (or smaller!  But it's easiest to think about when it's larger).
 *
 * true extent: the exact number of bytes required to hold the data
 * in the layout pattern in the datatype.
 *
 * For example, consider the following buffer (just talking about
 * true_lb, extent, and true extent -- extrapolate for true_ub):
 *
 * A              B                                       C
 * --------------------------------------------------------
 * |              |                                       |
 * --------------------------------------------------------
 *
 * There are multiple cases:
 *
 * 1. A is what we give to MPI_Send (and friends), and A is where
 *    the data starts, and C is where the data ends.  In this case:
 *
 *    - extent: C-A
 *    - true extent: C-A
 *    - true_lb: 0
 *
 * A                                                      C
 * --------------------------------------------------------
 * |                                                      |
 * --------------------------------------------------------
 * <=======================extent=========================>
 * <======================true extent=====================>
 *
 * 2. A is what we give to MPI_Send (and friends), B is where the
 *    data starts, and C is where the data ends.  In this case:
 *
 *    - extent: C-A
 *    - true extent: C-B
 *    - true_lb: positive
 *
 * A              B                                       C
 * --------------------------------------------------------
 * |              |           User buffer                 |
 * --------------------------------------------------------
 * <=======================extent=========================>
 *                <===============true extent=============>
 *
 * 3. B is what we give to MPI_Send (and friends), A is where the
 *    data starts, and C is where the data ends.  In this case:
 *
 *    - extent: C-A
 *    - true extent: C-A
 *    - true_lb: negative
 *
 * A              B                                       C
 * --------------------------------------------------------
 * |              |           User buffer                 |
 * --------------------------------------------------------
 * <=======================extent=========================>
 * <======================true extent=====================>
 *
 * 4. MPI_BOTTOM is what we give to MPI_Send (and friends), B is
 *    where the data starts, and C is where the data ends.  In this
 *    case:
 *
 *    - extent: C-MPI_BOTTOM
 *    - true extent: C-B
 *    - true_lb: [potentially very large] positive
 *
 * MPI_BOTTOM     B                                       C
 * --------------------------------------------------------
 * |              |           User buffer                 |
 * --------------------------------------------------------
 * <=======================extent=========================>
 *                <===============true extent=============>
 *
 * So in all cases, for a temporary buffer, all we need to malloc()
 * is a buffer of size true_extent.  We therefore need to know two
 * pointer values: what value to give to MPI_Send (and friends) and
 * what value to give to free(), because they might not be the same.
 *
 * Clearly, what we give to free() is exactly what was returned from
 * malloc().  That part is easy.  :-)
 *
 * What we give to MPI_Send (and friends) is a bit more complicated.
 * Let's take the 4 cases from above:
 *
 * 1. If A is what we give to MPI_Send and A is where the data
 *    starts, then clearly we give to MPI_Send what we got back from
 *    malloc().
 *
 * 2. If B is what we get back from malloc, but we give A to
 *    MPI_Send, then the buffer range [A,B) represents "dead space"
 *    -- no data will be put there.  So it's safe to give B-true_lb
 *    to MPI_Send.  More specifically, the true_lb is positive, so
 *    B-true_lb is actually A.
 *
 * 3. If A is what we get back from malloc, and B is what we give to
 *    MPI_Send, then the true_lb is negative, so A-true_lb will
 *    actually equal B.
 *
 * 4. Although this seems like the weirdest case, it's actually
 *    quite similar to case #2 -- the pointer we give to MPI_Send is
 *    smaller than the pointer we got back from malloc().
 *
 * Hence, in all cases, we give (return_from_malloc - true_lb) to
 * MPI_Send.
 *
 * This works fine and dandy if we only have (count==1), which we
 * rarely do.  ;-) So we really need to allocate (true_extent +
 * ((count - 1) * extent)) to get enough space for the rest.  This may
 * be more than is necessary, but it's ok.
 *
 * Simple, no?  :-)
 *
 */
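
The two rules in the comment above (allocate true_extent + ((count - 1) * extent) bytes, and hand (return_from_malloc - true_lb) to MPI) can be sketched in plain C roughly as follows. This is only an illustration under stated assumptions, not the code the collectives actually use: `layout_t`, `tmpbuf_t`, `tmpbuf_size()` and `tmpbuf_alloc()` are hypothetical names invented here, and the layout values are supplied by hand instead of being queried from a real datatype.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the numbers a datatype query would report;
 * this is not an Open MPI type. */
typedef struct {
    ptrdiff_t extent;       /* upper bound minus lower bound */
    ptrdiff_t true_lb;      /* offset of the first real data byte */
    ptrdiff_t true_extent;  /* bytes actually spanned by the data */
} layout_t;

/* The two pointers the comment says we must track separately. */
typedef struct {
    void *to_free;  /* exactly what malloc() returned */
    char *to_mpi;   /* what would be passed to MPI_Send (and friends) */
} tmpbuf_t;

/* Bytes needed for `count` elements: a full extent for every element
 * except the last, which only needs its true extent. */
static size_t tmpbuf_size(const layout_t *t, size_t count)
{
    assert(count > 0 && t->extent >= 0 && t->true_extent >= 0);
    return (size_t)t->true_extent + (count - 1) * (size_t)t->extent;
}

/* Allocate the temporary buffer and derive the pointer for MPI.
 * Assumption: the shift is done on uintptr_t values because, for a
 * positive true_lb, the result lies before the allocation, which plain
 * pointer arithmetic does not formally allow.  The conversion back to
 * a pointer is implementation-defined. */
static tmpbuf_t tmpbuf_alloc(const layout_t *t, size_t count)
{
    tmpbuf_t b = { NULL, NULL };
    char *base = malloc(tmpbuf_size(t, count));

    if (base != NULL) {
        b.to_free = base;
        b.to_mpi = (char *)((uintptr_t)base - (uintptr_t)t->true_lb);
    }
    return b;
}
```

In a real MPI program the three layout values would come from `MPI_Type_get_extent()` and `MPI_Type_get_true_extent()`. With case 2 from the comment (positive true_lb), `to_mpi` ends up below `to_free`; with case 3 (negative true_lb) it ends up above it. Either way only `to_free` may be passed to `free()`.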