Update the docs on the actual algorithms used

This commit was SVN r7216.
2005-09-07 15:46:33 +00:00 · 2005-09-07 15:46:33 +00:00 · 881851604b
--- a/ompi/mca/coll/sm/coll_sm_barrier.c
+++ b/ompi/mca/coll/sm/coll_sm_barrier.c
@@ -26,18 +26,26 @@
 /**
 * Shared memory barrier.
 *
- * Tree-based algorithm for a barrier -- the general scheme is a fan
- * in to rank 0 followed by a fan out using the control segments in
- * the shared memory area.  The data segments are not used.
+ * Tree-based algorithm for a barrier: a fan in to rank 0 followed by
+ * a fan out using the barrier segments in the shared memory area.
 *
- * The general algorithm is to wait for all N children to report in by
- * atomically increasing a uint32_t in my "in" control segment.  Once
- * that value equals N, I atomically increase the corresponding number
- * in my parent's "in" control segment.
+ * There are 2 sets of barrier buffers; since at most 2 barriers can
+ * be outstanding at any time, there is no need for more than this.
+ * The generalized in-use flags, control, and data segments are not
+ * used.
 *
- * If I have no parent and all N children have reported in, then I
- * write a 1 into each of my children's "out" control segments.  Once
- * the children see the 1, they do the same to their children.
+ * The general algorithm is for a given process to wait for its N
+ * children to fan in by monitoring a uint32_t in its barrier "in"
+ * buffer.  When this value reaches N (i.e., each of the children has
+ * atomically incremented the value), the process atomically
+ * increments the uint32_t in its parent's "in" buffer.  Then the
+ * process waits for the parent to set a "1" in the process's "out"
+ * buffer.  Once this happens, the process writes a "1" in each of its
+ * children's "out" buffers, and returns.
+ *
+ * There are corner cases, of course, such as the root that has no
+ * parent, and the leaves that have no children.  But that's the
+ * general idea.
 */
 int mca_coll_sm_barrier_intra(struct ompi_communicator_t *comm)
 {
--- a/ompi/mca/coll/sm/coll_sm_bcast.c
+++ b/ompi/mca/coll/sm/coll_sm_bcast.c
@@ -27,19 +27,26 @@
 /**
 * Shared memory broadcast.
 *
- * For the root, the general algorithm is to wait for the segment to
- * be available.  Once it is, it copies a fragment of the user's
- * buffer into the shared data segment and then write a 1 into its
- * childrens' "out" control buffers.  The process is repeated until
+ * For the root, the general algorithm is to wait for a set of
+ * segments to become available.  Once one is, the root claims the set
+ * by writing the current operation number and the number of processes
+ * using the set to the in-use flag.  The root then loops over the set
+ * of segments; for each segment, it copies a fragment of the user's
+ * buffer into the shared data segment and then writes the data size
+ * into its children's control buffers.  The process is repeated until
 * all fragments have been written.
 *
- * For non-roots, they wait for a 1 to appear into their "out" control
- * buffers.  If they have children, they copy the data from their
- * parent's shared data segment into their shared data segment, and
- * write a 1 into each of its childrens' "out" control buffers.  They
- * then copy the data from their shared [local] data segment into the
- * user's buffer.  The process is repeated until all fragments have
- * been received.
+ * For each set of buffers, non-roots wait until the current operation
+ * number appears in the in-use flag (i.e., written by the root).
+ * Then, for each segment, they wait for a nonzero value to appear in
+ * their control buffers.  If they have children, they copy the data
+ * from their parent's shared data segment into their own shared data
+ * segment, and write the data size into each of their children's
+ * control buffers.  They then copy the data from their shared [local]
+ * data segment into the user's output buffer.  If they do not have
+ * children, they copy the data directly from the parent's shared data
+ * segment into the user's output buffer.  The process is repeated
+ * until all fragments have been received.
 */
 int mca_coll_sm_bcast_intra(void *buff, int count, 
                            struct ompi_datatype_t *datatype, int root,