The optimization introduced a year ago, which saves a collective
synchronization step in certain communicator creation functions, has to be
disabled for now. The bug was exposed by the hierarch module, but could
also appear for inter-communicator creation. The problem is that within a
communicator creation step we invoke a comm_dup (for intercomm_create) or
other collective operations (in the case of hierarch) before all processes
have been synchronized. This led to the "Dropped message for non-existant
communicators" error. This commit disables the optimization without removing
it from the code base. In theory, it can be enabled again as soon as we have
the unexpected message queues for unknown cids, which, if I remember right,
are required anyway for the multi-threaded scenarios and potentially for
fault tolerance.
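
The failure mode and the deferred fix described above can be illustrated with a toy model. This is only a sketch and not Open MPI code: `receiver_t`, `receiver_on_message`, `receiver_activate` and the fixed-size limits are hypothetical names and values chosen for illustration. A message that arrives for a cid the receiver has not activated yet is either dropped (the error seen here) or parked in an unexpected-message queue until the communicator exists locally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CIDS    8   /* hypothetical upper bound on communicator ids */
#define MAX_PENDING 16  /* hypothetical capacity of the unexpected queue */

/* Hypothetical per-process receiver state (not an Open MPI structure). */
typedef struct {
    bool   active[MAX_CIDS];          /* cids already activated locally */
    int    pending_cid[MAX_PENDING];  /* cids of parked messages */
    size_t npending;                  /* number of parked messages */
    int    delivered;                 /* messages matched to a communicator */
    int    dropped;                   /* messages lost: cid was unknown */
} receiver_t;

void receiver_init(receiver_t *r)
{
    assert(r != NULL);
    *r = (receiver_t){0};
}

/* Handle one incoming message for communicator id `cid`.
 * With queue_unknown == false, a message for an unknown cid is dropped.
 * With queue_unknown == true, it is parked until the cid is activated. */
void receiver_on_message(receiver_t *r, int cid, bool queue_unknown)
{
    assert(r != NULL && cid >= 0 && cid < MAX_CIDS);
    if (r->active[cid]) {
        r->delivered++;
        return;
    }
    if (queue_unknown && r->npending < MAX_PENDING) {
        r->pending_cid[r->npending++] = cid;
        return;
    }
    r->dropped++;
}

/* Activate `cid` locally and deliver every message parked for it. */
void receiver_activate(receiver_t *r, int cid)
{
    assert(r != NULL && cid >= 0 && cid < MAX_CIDS);
    r->active[cid] = true;

    size_t kept = 0;
    for (size_t i = 0; i < r->npending; i++) {
        if (r->pending_cid[i] == cid) {
            r->delivered++;                       /* late match */
        } else {
            r->pending_cid[kept++] = r->pending_cid[i];
        }
    }
    r->npending = kept;
}
```

In this model, a fast peer that sends on a new cid before the local process has activated it loses the message when `queue_unknown` is false, and delivers it on activation when `queue_unknown` is true. The extra synchronization step restored by this commit avoids the situation altogether by ensuring that no process communicates on the new cid early.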

Before moving the patch to 1.3 I would like to let it soak for a couple of
days on trunk. Please note that my 2nd comment on ticket #1408 was only
semi-correct, since the order of activating the communicator and querying
the collective module had already been changed earlier.

This commit was SVN r19139.

The following Trac tickets were found above:
  Ticket 1408 --> https://svn.open-mpi.org/trac/ompi/ticket/1408
This commit is contained in:
Edgar Gabriel 2008-08-04 14:55:09 +00:00
parent 4712a73db5
commit 1adb3a6cda

@@ -9,7 +9,7 @@
* University of Stuttgart. All rights reserved.
* Copyright (c) 2004-2005 The Regents of the University of California.
* All rights reserved.
-* Copyright (c) 2007 University of Houston. All rights reserved.
+* Copyright (c) 2007-2008 University of Houston. All rights reserved.
* Copyright (c) 2007 Cisco, Inc. All rights reserved.
* $COPYRIGHT$
*
@@ -130,9 +130,9 @@ int ompi_comm_set ( ompi_communicator_t **ncomm,
}
newcomm->c_flags |= OMPI_COMM_INTER;
if ( OMPI_COMM_IS_INTRA(oldcomm) ) {
-ompi_comm_dup(oldcomm, &newcomm->c_local_comm,1);
+ompi_comm_dup(oldcomm, &newcomm->c_local_comm,0);
} else {
-ompi_comm_dup(oldcomm->c_local_comm, &newcomm->c_local_comm,1);
+ompi_comm_dup(oldcomm->c_local_comm, &newcomm->c_local_comm,0);
}
}
else {