The DISTRIBUTE directive specifies a mapping of data objects to abstract processors in a processor arrangement. For example,
REAL WEISSWURST(10000)
!HPF$ DISTRIBUTE WEISSWURST(BLOCK(256))
This specifies that groups of exactly 256 elements should be mapped to successive abstract processors. (There must be at least 40 abstract processors, that is, 10000/256 rounded up, if the directive is to be satisfied. The fortieth processor will contain a partial block of only 16 elements, namely WEISSWURST(9985:10000).)
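The arithmetic behind the processor count and the partial last block can be checked with a few lines of Python. This is only an illustrative sketch; the helper names `blocks_needed` and `last_block` are hypothetical and are not part of HPF.

```python
def blocks_needed(n_elements, block_size):
    """Number of blocks of block_size needed to cover n_elements (rounded up)."""
    return (n_elements + block_size - 1) // block_size


def last_block(n_elements, block_size):
    """1-based (first, last) element indices of the final, possibly partial, block."""
    first = (blocks_needed(n_elements, block_size) - 1) * block_size + 1
    return first, n_elements


# Values taken from the WEISSWURST(BLOCK(256)) example.
count = blocks_needed(10000, 256)
first, last = last_block(10000, 256)
print(count, first, last, last - first + 1)
```

For the example in the text this reports 40 blocks, with the last block covering elements 9985 through 10000 (16 elements).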
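HPF also provides a cyclic distribution format:
REAL DECK_OF_CARDS(52)
!HPF$ DISTRIBUTE DECK_OF_CARDS(CYCLIC)
A distribution format may also be given separately for each dimension of a multidimensional array:
!HPF$ DISTRIBUTE CHESS_BOARD(BLOCK, BLOCK)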
Note that a DISTRIBUTE directive of the form
!HPF$ DISTRIBUTE dist-attribute-stuff :: distributee-list
is covered by the syntax rule for a combined-directive.
Examples:
!HPF$ DISTRIBUTE (BLOCK,*,BLOCK) ONTO SQUARE :: D2,D3,D4
The meanings of the alternatives for dist-format are given below.
Define the ceiling division function CD(J,K) = (J+K-1)/K (using
Fortran integer arithmetic with truncation toward zero).
Define the ceiling remainder function CR(J,K) = J-K*CD(J,K).
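As a hedged illustration, the two functions can be written in Python as follows. The sketch assumes positive integer arguments, for which Python's floor division `//` gives the same result as Fortran integer division with truncation toward zero; the lowercase names `cd` and `cr` are chosen here for the sketch.

```python
def cd(j, k):
    """Ceiling division CD(J,K) = (J+K-1)/K for positive integers J and K."""
    if j <= 0 or k <= 0:
        raise ValueError("this sketch only handles positive arguments")
    return (j + k - 1) // k


def cr(j, k):
    """Ceiling remainder CR(J,K) = J - K*CD(J,K)."""
    return j - k * cd(j, k)


print(cd(10000, 256), cr(10000, 256))
print(cd(7, 7), cr(7, 7))
print(cd(8, 7), cr(8, 7))
```

For positive arguments CR(J,K) is never positive: it lies between -(K-1) and 0, and it is 0 exactly when K divides J.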
The dimensions of a processor arrangement appearing as a dist-target are said to correspond in left-to-right order with
those dimensions of a distributee for which the corresponding
dist-format is not *. In the example above, processor
arrangement SQUARE must be two-dimensional; its first dimension
corresponds to the first dimensions of D2, D3, and D4
and its second dimension corresponds to the third dimensions of D2, D3, and D4.
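The correspondence rule amounts to skipping every * entry in the dist-format list. A minimal Python sketch, with the hypothetical helper name `corresponding_dims` and the dist-format list represented as a list of strings, might look like this:

```python
def corresponding_dims(dist_formats):
    """For each processor-arrangement dimension, in left-to-right order,
    return the 1-based distributee dimension it corresponds to.

    Distributee dimensions whose dist-format is "*" are skipped.
    """
    return [
        dim
        for dim, fmt in enumerate(dist_formats, start=1)
        if fmt.strip() != "*"
    ]


# The (BLOCK,*,BLOCK) example: the processor arrangement must have rank 2.
dims = corresponding_dims(["BLOCK", "*", "BLOCK"])
print(len(dims), dims)
```

For (BLOCK,*,BLOCK) the result has two entries, 1 and 3, matching the statement that SQUARE is two-dimensional and corresponds to the first and third dimensions of each distributee.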
Let d be the size of a distributee in a certain dimension and
let p be the size of the processor arrangement in the corresponding
dimension. For simplicity, assume all dimensions have a lower bound of
1. Then BLOCK(m) means that a distributee position
whose index along that dimension is j is mapped to an abstract
processor whose index along the corresponding dimension of the
processor arrangement is CD(j,m) (note that m*p >= d must be true),
and is position number m+CR(j,m) among positions mapped to that abstract
processor. The first distributee position in abstract processor k
along that axis is position number 1+m*(k-1).
BLOCK by definition means the same as BLOCK(CD(d,p)).
CYCLIC(m) means that a distributee position whose index
along that dimension is j is mapped to an abstract processor whose
index along the corresponding dimension of the processor arrangement is
1+MODULO(CD(j,m)-1,p). The first distributee
position in abstract processor k along that axis is position number
1+m*(k-1).
CYCLIC by definition means the same as CYCLIC(1).
CYCLIC(m) and BLOCK(m) imply the same distribution
when m*p >= d, but BLOCK(m) additionally
asserts that the distribution will not wrap around in a cyclic manner,
which a compiler cannot determine at compile time if m is not
constant. Note that CYCLIC and BLOCK (without argument
expressions) do not imply the same distribution unless p >= d,
a degenerate case in which the block size is 1 and the distribution
does not wrap around.
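The definitions above can be sketched in Python. The names are choices made for this sketch: j is a distributee index, m the block size argument, d the distributee size, and p the processor arrangement size along the dimension, all 1-based. Only positive values are handled, so Python's `//` and `%` agree with Fortran integer division and MODULO.

```python
def cd(j, k):
    """Ceiling division for positive integers."""
    return (j + k - 1) // k


def cr(j, k):
    """Ceiling remainder, J - K*CD(J,K)."""
    return j - k * cd(j, k)


def block_map(j, m, d, p):
    """BLOCK(m): (processor index, position within that processor) for index j."""
    if m * p < d:
        raise ValueError("BLOCK(m) requires m*p >= d")
    return cd(j, m), m + cr(j, m)


def default_block_size(d, p):
    """Block size implied by BLOCK without an argument."""
    return cd(d, p)


def cyclic_owner(j, m, p):
    """CYCLIC(m): processor index for distributee index j."""
    return 1 + (cd(j, m) - 1) % p


print(block_map(9985, 256, 10000, 40))
print(block_map(10000, 256, 10000, 40))
print(default_block_size(100, 16))
# With m*p >= d, CYCLIC(m) and BLOCK(m) place every element identically.
print(all(cyclic_owner(j, 8, 16) == block_map(j, 8, 100, 16)[0]
          for j in range(1, 101)))
```

The last check illustrates the equivalence claim: when m*p >= d the block number CD(j,m) never exceeds p, so the MODULO in the cyclic formula has no effect.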
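Suppose that we have 16 abstract processors and an array of length 100:
!HPF$ PROCESSORS SEDECIM(16)
REAL CENTURY(100)
Distributing the array BLOCK (which in this case means the same as BLOCK(7)):
!HPF$ DISTRIBUTE CENTURY(BLOCK) ONTO SEDECIM
results in this mapping of array elements onto abstract processors: processors 1 through 14 hold seven consecutive elements each (processor 1 holds CENTURY(1:7) and processor 14 holds CENTURY(92:98)), processor 15 holds CENTURY(99:100), and processor 16 holds no elements.
Distributing the array BLOCK(8):
!HPF$ DISTRIBUTE CENTURY(BLOCK(8)) ONTO SEDECIM
results in this mapping of array elements onto abstract processors: processors 1 through 12 hold eight consecutive elements each (processor 12 holds CENTURY(89:96)), processor 13 holds CENTURY(97:100), and processors 14 through 16 hold no elements.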
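The full BLOCK and BLOCK(8) mappings for a 100-element array on 16 abstract processors can be enumerated with a short Python sketch. The function name `block_layout` and its dictionary return format are assumptions made for illustration.

```python
def cd(j, k):
    """Ceiling division for positive integers."""
    return (j + k - 1) // k


def block_layout(d, p, m=None):
    """Map each processor 1..p to the list of element indices it holds.

    m=None models BLOCK without an argument, i.e. BLOCK(CD(d,p)).
    """
    if m is None:
        m = cd(d, p)
    if m * p < d:
        raise ValueError("BLOCK(m) requires m*p >= d")
    layout = {k: [] for k in range(1, p + 1)}
    for j in range(1, d + 1):
        layout[cd(j, m)].append(j)
    return layout


plain = block_layout(100, 16)        # CENTURY(BLOCK) ONTO SEDECIM
block8 = block_layout(100, 16, m=8)  # CENTURY(BLOCK(8)) ONTO SEDECIM

for k in range(1, 17):
    print(k, plain[k], block8[k])
```

Under the same rule, a block size of 6 would be rejected for this array, because 6*16 = 96 is less than 100.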
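Distributing the array CYCLIC (which means the same as CYCLIC(1)):
!HPF$ DISTRIBUTE CENTURY(CYCLIC) ONTO SEDECIM
results in this mapping of array elements onto abstract processors: processor k holds elements k, k+16, k+32, and so on, so processors 1 through 4 hold seven elements each (processor 1 holds CENTURY(1:97:16)) and processors 5 through 16 hold six elements each.
Distributing the array CYCLIC(3):
!HPF$ DISTRIBUTE CENTURY(CYCLIC(3)) ONTO SEDECIM
results in this mapping of array elements onto abstract processors: groups of three consecutive elements are dealt to processors 1 through 16 in turn, wrapping around after every 48 elements. Processor 1 holds CENTURY(1:3), CENTURY(49:51), and CENTURY(97:99); processor 2 holds CENTURY(4:6), CENTURY(52:54), and CENTURY(100); each of processors 3 through 16 holds two groups of three elements.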
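The CYCLIC and CYCLIC(3) mappings for the same 100-element array on 16 abstract processors can be enumerated in the same spirit. This is a sketch only; `cyclic_layout` is a hypothetical helper name, and all indices are 1-based.

```python
def cyclic_layout(d, p, m=1):
    """Map each processor 1..p to the element indices it holds under CYCLIC(m).

    m=1 models CYCLIC without an argument.
    """
    layout = {k: [] for k in range(1, p + 1)}
    for j in range(1, d + 1):
        block = (j + m - 1) // m           # CD(j, m): which block j falls in
        owner = 1 + (block - 1) % p        # blocks are dealt round-robin
        layout[owner].append(j)
    return layout


cyclic1 = cyclic_layout(100, 16)        # CENTURY(CYCLIC) ONTO SEDECIM
cyclic3 = cyclic_layout(100, 16, m=3)   # CENTURY(CYCLIC(3)) ONTO SEDECIM

for k in range(1, 17):
    print(k, cyclic1[k], cyclic3[k])
```

Unlike the BLOCK formats, no processor is left empty here, because the blocks wrap around the processor arrangement.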
A DISTRIBUTE directive of the form
!HPF$ DISTRIBUTE distributee ( dist-format-list ) ONTO dist-target
is equivalent to
!HPF$ DISTRIBUTE ( dist-format-list ) ONTO dist-target :: distributee