4d00c59b2e
some of the collective modules. Added a new function opal_datatype_span to compute the memory span of count elements of a datatype, excluding the gaps at the beginning and at the end. If memory is allocated using the returned value, the gap (also returned) should be subtracted from the allocated pointer.
125 lines
5.4 KiB
Plaintext
/* This comment applies to all collectives (including the basic
 * module) where we allocate a temporary buffer. For the next few
 * lines of code, it's tremendously complicated how we decided that
 * this was the Right Thing to do. Sit back and enjoy. And prepare
 * to have your mind warped. :-)
 *
 * Recall some definitions (I always get these backwards, so I'm
 * going to put them here):
 *
 * extent: the length from the lower bound to the upper bound -- may
 * be considerably larger than the buffer required to hold the data
 * (or smaller! But it's easiest to think about when it's larger).
 *
 * true extent: the exact number of bytes required to hold the data
 * in the layout pattern in the datatype.
 *
 * For example, consider the following buffer (just talking about
 * true_lb, extent, and true extent -- extrapolate for true_ub):
 *
 * A              B                                       C
 * --------------------------------------------------------
 * |              |                                       |
 * --------------------------------------------------------
 *
 * There are multiple cases:
 *
 * 1. A is what we give to MPI_Send (and friends), and A is where
 *    the data starts, and C is where the data ends. In this case:
 *
 *    - extent: C-A
 *    - true extent: C-A
 *    - true_lb: 0
 *
 * A                                                      C
 * --------------------------------------------------------
 * |                                                      |
 * --------------------------------------------------------
 * <=======================extent=========================>
 * <======================true extent=====================>
 *
 * 2. A is what we give to MPI_Send (and friends), B is where the
 *    data starts, and C is where the data ends. In this case:
 *
 *    - extent: C-A
 *    - true extent: C-B
 *    - true_lb: positive
 *
 * A              B                                       C
 * --------------------------------------------------------
 * |              |           User buffer                 |
 * --------------------------------------------------------
 * <=======================extent=========================>
 *                <===============true extent=============>
 *
 * 3. B is what we give to MPI_Send (and friends), A is where the
 *    data starts, and C is where the data ends. In this case:
 *
 *    - extent: C-A
 *    - true extent: C-A
 *    - true_lb: negative
 *
 * A              B                                       C
 * --------------------------------------------------------
 * |              |           User buffer                 |
 * --------------------------------------------------------
 * <=======================extent=========================>
 * <======================true extent=====================>
 *
 * 4. MPI_BOTTOM is what we give to MPI_Send (and friends), B is
 *    where the data starts, and C is where the data ends. In this
 *    case:
 *
 *    - extent: C-MPI_BOTTOM
 *    - true extent: C-B
 *    - true_lb: [potentially very large] positive
 *
 * MPI_BOTTOM     B                                       C
 * --------------------------------------------------------
 * |              |           User buffer                 |
 * --------------------------------------------------------
 * <=======================extent=========================>
 *                <===============true extent=============>
 *
 * So in all cases, for a temporary buffer, all we need to malloc()
 * is a buffer of size true_extent. We therefore need to know two
 * pointer values: what value to give to MPI_Send (and friends) and
 * what value to give to free(), because they might not be the same.
 *
 * Clearly, what we give to free() is exactly what was returned from
 * malloc(). That part is easy. :-)
 *
 * What we give to MPI_Send (and friends) is a bit more complicated.
 * Let's take the 4 cases from above:
 *
 * 1. If A is what we give to MPI_Send and A is where the data
 *    starts, then clearly we give to MPI_Send what we got back from
 *    malloc().
 *
 * 2. If B is what we get back from malloc(), but we give A to
 *    MPI_Send, then the buffer range [A,B) represents "dead space"
 *    -- no data will be put there. So it's safe to give B-true_lb to
 *    MPI_Send. More specifically, the true_lb is positive, so
 *    B-true_lb is actually A.
 *
 * 3. If A is what we get back from malloc(), and B is what we give to
 *    MPI_Send, then the true_lb is negative, so A-true_lb will
 *    actually equal B.
 *
 * 4. Although this seems like the weirdest case, it's actually
 *    quite similar to case #2 -- the pointer we give to MPI_Send is
 *    smaller than the pointer we got back from malloc().
 *
 * Hence, in all cases, we give (return_from_malloc - true_lb) to MPI_Send.
 *
 * This works fine and dandy if we only have (count==1), which we
 * rarely do. ;-) So we really need to allocate (true_extent +
 * ((count - 1) * extent)) to get enough space for the rest. This may
 * be more than is necessary, but it's ok.
 *
 * Simple, no? :-)
 *
 */
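
/* A minimal sketch of the recipe above, not code taken from Open MPI.
 * The names tmpbuf_t, tmpbuf_size and tmpbuf_alloc are hypothetical,
 * and true_lb, true_extent and extent are assumed to have been
 * queried already (e.g. with MPI_Type_get_true_extent and
 * MPI_Type_get_extent). It also assumes a non-negative extent, and it
 * does the pointer shift in integer arithmetic so that no
 * out-of-range char* is formed along the way.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping struct: the two pointer values the comment
 * says we need to remember. */
typedef struct {
    void *to_free;  /* exactly what malloc() returned; give this to free() */
    void *to_send;  /* to_free - true_lb; give this to MPI_Send (and friends) */
} tmpbuf_t;

/* Bytes to allocate for `count` elements:
 * true_extent + ((count - 1) * extent), or 0 when count is 0.
 * Assumes extent >= 0. */
static size_t tmpbuf_size(size_t true_extent, size_t extent, size_t count)
{
    return (count == 0) ? 0 : true_extent + (count - 1) * extent;
}

/* Allocate the temporary buffer and compute the shifted pointer.
 * Returns 0 on success, -1 if malloc() fails. */
static int tmpbuf_alloc(tmpbuf_t *buf, ptrdiff_t true_lb,
                        size_t true_extent, size_t extent, size_t count)
{
    size_t size = tmpbuf_size(true_extent, extent, count);

    assert(buf != NULL);
    buf->to_free = malloc(size > 0 ? size : 1);
    if (buf->to_free == NULL) {
        buf->to_send = NULL;
        return -1;
    }
    /* return_from_malloc - true_lb: a positive true_lb moves the
     * pointer down (cases 2 and 4), a negative one moves it up
     * (case 3), and zero leaves it alone (case 1). */
    buf->to_send = (void *)((uintptr_t)buf->to_free - (uintptr_t)true_lb);
    return 0;
}
```

/* For case 2 with true_lb = 16, true_extent = 40, extent = 56 and
 * count = 3, this allocates 40 + 2 * 56 = 152 bytes and hands MPI a
 * pointer 16 bytes below the one that must later go to free().
 */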