Example (using MPI_ORDER_C, so the array below has 6 rows of 4 ints to parcel out):
size = 4;
rank = 0;
ndims=2;
gsizes[0] = 6;
gsizes[1] = 4;
distribs[0] = MPI_DISTRIBUTE_CYCLIC;
distribs[1] = MPI_DISTRIBUTE_BLOCK;
dargs[0] = 2;
dargs[1] = 2;
psizes[0] = 2;
psizes[1] = 2;
MPI_Type_create_darray(size, rank, ndims,
    gsizes, distribs, dargs, psizes,
    MPI_ORDER_C, MPI_INT, &mydt);
Expectation for the layout:
inner dimension (1) is:
4 items (ints) distributed block over 2 ranks with 2 items each
e.g. for rank 0: [ x x . . ]
outer dimension (0) is:
6 items (the above [ x x . . ] rows) distributed cyclic over 2 ranks in blocks of 2 items
e.g. for rank 0:
[ x x . . ] : offset=0 bytes=8
[ x x . . ] : offset=16 bytes=8
[ . . . . ]
[ . . . . ]
[ x x . . ] : offset=64 bytes=8
[ x x . . ] : offset=80 bytes=8
More specifically, a stream of ints 0,1,2,3,4,5,6,7 sent into that
type should land as
[ 0 1 . . ]
[ 2 3 . . ]
[ . . . . ]
[ . . . . ]
[ 4 5 . . ]
[ 6 7 . . ]
Instead, the data was being laid out as
[ 0 1 2 3 ]
[ . . . . ]
[ . . . . ]
[ . . . . ]
[ 4 5 6 7 ]
[ . . . . ]
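because the recursive construction inside the block() function (which
creates the smaller row datatype [ x x . . ]) wasn't setting the extent
of that type. The row type needs an extent of one full row (16 bytes);
without it, the second row's data landed directly after the first row's
8 bytes.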
Signed-off-by: Mark Allen <markalle@us.ibm.com>