187 строки
6.6 KiB
Plaintext
187 строки
6.6 KiB
Plaintext
|
.\" -*- nroff -*-
|
||
|
.\" Copyright (c) 2015 University of Houston. All rights reserved.
|
||
|
.\" Copyright (c) 2015 Mellanox Technologies, Inc.
|
||
|
.\" $COPYRIGHT$
|
||
|
.de Vb
|
||
|
.ft CW
|
||
|
.nf
|
||
|
..
|
||
|
.de Ve
|
||
|
.ft R
|
||
|
|
||
|
.fi
|
||
|
..
|
||
|
.TH "SHMEM\\_BROADCAST" "3" "#OMPI_DATE#" "#PACKAGE_VERSION#" "#PACKAGE_NAME#"
|
||
|
.SH NAME
|
||
|
|
||
|
\fIshmem_broadcast4\fP(3),
|
||
|
\fIshmem_broadcast8\fP(3),
|
||
|
\fIshmem_broadcast32\fP(3),
|
||
|
\fIshmem_broadcast64\fP(3)
|
||
|
\- Copy a data object from a designated PE to a target
|
||
|
location on all other PEs of the active set.
|
||
|
.SH SYNOPSIS
|
||
|
|
||
|
C or C++:
|
||
|
.Vb
|
||
|
#include <mpp/shmem.h>
|
||
|
|
||
|
void shmem_broadcast32(void *target, const void *source,
|
||
|
size_t nelems, int PE_root, int PE_start, int logPE_stride,
|
||
|
int PE_size, long *pSync);
|
||
|
|
||
|
void shmem_broadcast64(void *target, const void *source,
|
||
|
size_t nelems, int PE_root, int PE_start, int logPE_stride,
|
||
|
int PE_size, long *pSync);
|
||
|
.Ve
|
||
|
Fortran:
|
||
|
.Vb
|
||
|
INCLUDE "mpp/shmem.fh"
|
||
|
|
||
|
INTEGER nelems, PE_root, PE_start, logPE_stride, PE_size
|
||
|
INTEGER pSync(SHMEM_BCAST_SYNC_SIZE)
|
||
|
|
||
|
CALL SHMEM_BROADCAST4(target, source, nelems, PE_root,
|
||
|
& PE_start, logPE_stride, PE_size, fIpSync)
|
||
|
|
||
|
CALL SHMEM_BROADCAST8(target, source, nelems, PE_root,
|
||
|
& PE_start, logPE_stride, PE_size, pSync)
|
||
|
|
||
|
CALL SHMEM_BROADCAST32(target, source, nelems,
|
||
|
& PE_root, PE_start, logPE_stride, PE_size, pSync)
|
||
|
|
||
|
CALL SHMEM_BROADCAST64(target, source, nelems,
|
||
|
& PE_root, PE_start, logPE_stride, PE_size, pSync)
|
||
|
.Ve
|
||
|
.SH DESCRIPTION
|
||
|
|
||
|
The broadcast routines write the data at address source of the PE specified by
|
||
|
\fBPE_root\fP
|
||
|
to address \fBtarget\fP
|
||
|
on all other PEs in the active set. The active set of
|
||
|
PEs is defined by the triplet \fBPE_start\fP,
|
||
|
\fBlogPE_stride\fP
|
||
|
and \fBPE_size\fP\&.
|
||
|
The data is not copied to the target address on the PE specified by \fBPE_root\fP\&.
|
||
|
Before returning, the broadcast routines ensure that the elements of the pSync array are
|
||
|
restored to their initial values.
|
||
|
.PP
|
||
|
As with all SHMEM collective routines, each of these routines assumes that only PEs in the
|
||
|
active set call the routine. If a PE not in the active set calls a SHMEM collective routine,
|
||
|
undefined behavior results.
|
||
|
.PP
|
||
|
The arguments are as follows:
|
||
|
.TP
|
||
|
target
|
||
|
A symmetric data object with one of the following data types:
|
||
|
.RS
|
||
|
.TP
|
||
|
\fBshmem_broadcast8, shmem_broadcast64\fP: Any noncharacter type that
|
||
|
has an element size of 64 bits. No Fortran derived types or C/C++ structures are allowed.
|
||
|
.TP
|
||
|
\fBshmem_broadcast32\fP: Any noncharacter type that has an element size
|
||
|
of 32 bits. No Fortran derived types or C/C++ structures are allowed.
|
||
|
.TP
|
||
|
\fBshmem_broadcast4\fP: Any noncharacter type that has an element size
|
||
|
of 32 bits.
|
||
|
.RE
|
||
|
.RS
|
||
|
.PP
|
||
|
.RE
|
||
|
.TP
|
||
|
source
|
||
|
A symmetric data object that can be of any data type that is permissible for the
|
||
|
target argument.
|
||
|
.TP
|
||
|
nelems
|
||
|
The number of elements in source. For shmem_broadcast32 and
|
||
|
shmem_broadcast4, this is the number of 32\-bit halfwords. nelems must be of type integer.
|
||
|
If you are using Fortran, it must be a default integer value.
|
||
|
.TP
|
||
|
PE_root
|
||
|
Zero\-based ordinal of the PE, with respect to the active set, from which the
|
||
|
data is copied. Must be greater than or equal to 0 and less than PE_size. PE_root must be of
|
||
|
type integer. If you are using Fortran, it must be a default integer value.
|
||
|
.TP
|
||
|
PE_start
|
||
|
The lowest virtual PE number of the active set of PEs. PE_start must be of
|
||
|
type integer. If you are using Fortran, it must be a default integer value.
|
||
|
.TP
|
||
|
logPE_stride
|
||
|
The log (base 2) of the stride between consecutive virtual PE numbers in
|
||
|
the active set. log_PE_stride must be of type integer. If you are using Fortran, it must be a
|
||
|
default integer value.
|
||
|
.TP
|
||
|
PE_size
|
||
|
The number of PEs in the active set. PE_size must be of type integer. If you
|
||
|
are using Fortran, it must be a default integer value.
|
||
|
.PP
|
||
|
.TP
|
||
|
pSync
|
||
|
A symmetric work array. In C/C++, pSync must be of type long and size
|
||
|
_SHMEM_BCAST_SYNC_SIZE.
|
||
|
In Fortran, pSync must be of type integer and size SHMEM_BCAST_SYNC_SIZE. Every
|
||
|
element of this array must be initialized with the value _SHMEM_SYNC_VALUE (in C/C++)
|
||
|
or SHMEM_SYNC_VALUE (in Fortran) before any of the PEs in the active set enter
|
||
|
shmem_barrier().
|
||
|
.PP
|
||
|
The values of arguments PE_root, PE_start, logPE_stride, and PE_size must be equal on
|
||
|
all PEs in the active set. The same target and source data objects and the same pSync work
|
||
|
array must be passed to all PEs in the active set.
|
||
|
.PP
|
||
|
Before any PE calls a broadcast routine, you must ensure that the following conditions exist
|
||
|
(synchronization via a barrier or some other method is often needed to ensure this): The
|
||
|
pSync array on all PEs in the active set is not still in use from a prior call to a broadcast
|
||
|
routine. The target array on all PEs in the active set is ready to accept the broadcast data.
|
||
|
.PP
|
||
|
Upon return from a broadcast routine, the following are true for the local PE: If the current PE
|
||
|
is not the root PE, the target data object is updated. The values in the pSync array are
|
||
|
restored to the original values.
|
||
|
.SH NOTES
|
||
|
|
||
|
The terms collective and symmetric are defined in \fIintro_shmem\fP(3)\&.
|
||
|
.PP
|
||
|
All SHMEM broadcast routines restore pSync to its original contents. Multiple calls to SHMEM
|
||
|
routines that use the same pSync array do not require that pSync be reinitialized after the
|
||
|
first call.
|
||
|
.PP
|
||
|
You must ensure the that the pSync array is not being updated by any PE in the active set
|
||
|
while any of the PEs participates in processing of a SHMEM broadcast routine. Be careful to
|
||
|
avoid these situations: If the pSync array is initialized at run time, some type of
|
||
|
synchronization is needed to ensure that all PEs in the working set have initialized pSync
|
||
|
before any of them enter a SHMEM routine called with the pSync synchronization array. A
|
||
|
pSync array may be reused on a subsequent SHMEM broadcast routine only if none of the PEs
|
||
|
in the active set are still processing a prior SHMEM broadcast routine call that used the same
|
||
|
pSync array. In general, this can be ensured only by doing some type of synchronization.
|
||
|
However, in the special case of SHMEM routines being called with the same active set, you
|
||
|
can allocate two pSync arrays and alternate between them on successive calls.
|
||
|
.PP
|
||
|
.SH EXAMPLES
|
||
|
|
||
|
In the following examples, the call to shmem_broadcast64 copies source on PE 4 to target
|
||
|
on PEs 5, 6, and 7.
|
||
|
.PP
|
||
|
C/C++ example:
|
||
|
.Vb
|
||
|
for (i=0; i < _SHMEM_BCAST_SYNC_SIZE; i++) {
|
||
|
pSync[i] = _SHMEM_SYNC_VALUE;
|
||
|
}
|
||
|
shmem_barrier_all(); /* Wait for all PEs to initialize pSync */
|
||
|
shmem_broadcast64(target, source, nelems, 0, 4, 0, 4, pSync);
|
||
|
.Ve
|
||
|
Fortran example:
|
||
|
.Vb
|
||
|
INTEGER PSYNC(SHMEM_BCAST_SYNC_SIZE)
|
||
|
INTEGER TARGET, SOURCE, NELEMS, PE_ROOT, PE_START,
|
||
|
& LOGPE_STRIDE, PE_SIZE, PSYNC
|
||
|
COMMON /COM/ TARGET, SOURCE
|
||
|
DATA PSYNC /SHMEM_BCAST_SYNC_SIZE*SHMEM_SYNC_VALUE/
|
||
|
|
||
|
CALL SHMEM_BROADCAST64(TARGET, SOURCE, NELEMS, 0, 4, 0, 4,
|
||
|
& PSYNC)
|
||
|
.Ve
|
||
|
.PP
|
||
|
.SH SEE ALSO
|
||
|
|
||
|
\fIintro_shmem\fP(3)
|