From e8d7373b14909100cece81d21677cac4472acc27 Mon Sep 17 00:00:00 2001
From: Mike Dubman
Date: Wed, 30 Sep 2015 12:23:23 +0300
Subject: [PATCH] COLL/FCA: revert to prev barrier if called from finalize

FCA barrier may not complete if FCA progress is not called periodically.
PMI/PMI2 API that can be used in rte barrier has no provision for calling
external progress function. So it is possible that during finalize some
ranks will be stuck in fca barrier while others are in PMI barrier.
---
 ompi/mca/coll/fca/coll_fca_ops.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/ompi/mca/coll/fca/coll_fca_ops.c b/ompi/mca/coll/fca/coll_fca_ops.c
index 093bd46988..7d2711f15a 100644
--- a/ompi/mca/coll/fca/coll_fca_ops.c
+++ b/ompi/mca/coll/fca/coll_fca_ops.c
@@ -153,6 +153,10 @@ int mca_coll_fca_barrier(struct ompi_communicator_t *comm,
     int ret;
 
     FCA_VERBOSE(5,"Using FCA Barrier");
+    if (OPAL_UNLIKELY(ompi_mpi_finalize_started)) {
+        FCA_VERBOSE(5, "In finalize, reverting to previous barrier");
+        goto orig_barrier;
+    }
     ret = fca_do_barrier(fca_module->fca_comm);
     if (ret < 0) {
         if (ret == -EUSEMPI) {