
Fix for Ireduce + MPI_IN_PLACE.

Fixes a wrong answer from MPI_Ireduce when the red_sched_chain()
path was taken (which only happens for np<=4 and mesgsize>=64k).

libnbc handles MPI_IN_PLACE by setting sbuf == rbuf. Whether an
algorithm still works correctly after that depends on the details of
how it reads sbuf and writes rbuf.

In this case the last steps of the algorithm amounted to
    (right neighbor is sending us reduction results from ranks 1..n-1)
    recv into rbuf from right neighbor
    add the contribution from our sbuf into rbuf
This is fine in general, but if sbuf == rbuf, the recv overwrites
sbuf before its contribution is added. The fix is to recv into a
tmpbuf when MPI_IN_PLACE is used, and then add tmpbuf into rbuf.
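
The hazard described above can be reproduced without MPI. The following is a minimal sketch, not libnbc code: `fake_recv` and `fake_op_sum` are hypothetical stand-ins for the scheduled receive and reduction op, the operation is assumed to be an integer sum, and `partial` plays the role of the reduction of ranks 1..n-1 arriving from the right neighbor.

```c
#include <string.h>

#define COUNT 4

/* Hypothetical stand-in for the recv from the right neighbor:
 * copies the partial reduction of ranks 1..n-1 into dst. */
static void fake_recv(int *dst, const int *partial, int count) {
    memcpy(dst, partial, (size_t)count * sizeof(int));
}

/* Hypothetical stand-in for the reduction op (assumed to be a sum):
 * inout[i] += in[i]. The two pointers are allowed to alias. */
static void fake_op_sum(const int *in, int *inout, int count) {
    for (int i = 0; i < count; i++) {
        inout[i] += in[i];
    }
}

/* Old ordering: recv into rbuf, then add sbuf into rbuf.
 * If sbuf == rbuf, the recv has already destroyed our contribution. */
static void root_step_old(const int *sbuf, int *rbuf,
                          const int *partial, int count) {
    fake_recv(rbuf, partial, count);
    fake_op_sum(sbuf, rbuf, count);
}

/* Fixed ordering: when sbuf == rbuf (the in-place case), recv into
 * tmpbuf instead and add tmpbuf into rbuf, which still holds our
 * own contribution. */
static void root_step_fixed(const int *sbuf, int *rbuf, int *tmpbuf,
                            const int *partial, int count) {
    if (sbuf != rbuf) {
        fake_recv(rbuf, partial, count);
        fake_op_sum(sbuf, rbuf, count);
    } else {
        fake_recv(tmpbuf, partial, count);
        fake_op_sum(tmpbuf, rbuf, count);
    }
}
```

With a local contribution of {1, 2, 3, 4} and a partial result of {10, 20, 30, 40}, calling root_step_old with sbuf == rbuf leaves {20, 40, 60, 80} in rbuf (the partial result added to itself), while root_step_fixed leaves the expected {11, 22, 33, 44}.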

Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
Geoffrey Paulsen 2017-01-25 08:00:00 -08:00
parent 4e06b96701
commit 045d0c5f4c


@@ -9,6 +9,7 @@
* reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2017 IBM Corporation. All rights reserved.
*
* Author(s): Torsten Hoefler <htor@cs.indiana.edu>
*
@@ -427,9 +428,29 @@ static inline int red_sched_chain (int rank, int p, int root, const void *sendbu
/* last node does not recv */
if (vrank != p-1) {
if (vrank == 0) {
res = NBC_Sched_recv ((char *)recvbuf+offset, false, thiscount, datatype, rpeer, schedule, true);
if (sendbuf != recvbuf) {
// for regular src, recv into recvbuf
res = NBC_Sched_recv ((char *)recvbuf+offset, false, thiscount, datatype, rpeer, schedule, true);
} else {
// but for any-src, recv into tmpbuf
// because for any-src if we recved into recvbuf here we'd be
// overwriting our sendbuf, and we use it in the operation
// that happens further down
res = NBC_Sched_recv ((char *)offset, true, thiscount, datatype, rpeer, schedule, true);
}
} else {
res = NBC_Sched_recv ((char *) offset, true, thiscount, datatype, rpeer, schedule, true);
if (sendbuf != recvbuf) {
// for regular src, add sendbuf into recvbuf
// (here recvbuf holds the reduction from 1..n-1)
res = NBC_Sched_op ((char *) sendbuf + offset, false, (char *) recvbuf + offset, false,
thiscount, datatype, op, schedule, true);
} else {
// for any-src, add tmpbuf into recvbuf
// (here tmpbuf holds the reduction from 1..n-1) and
// recvbuf is our sendbuf
res = NBC_Sched_op ((char *) offset, true, (char *) recvbuf + offset, false,
thiscount, datatype, op, schedule, true);
}
}
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
return res;