The DISTRIBUTE directive specifies a mapping of data objects
to abstract processors in a processor arrangement.
For example,
The block size may be specified explicitly:
HPF also provides a cyclic distribution format:
Distributions may be specified independently for each dimension of a
multidimensional array:
The REDISTRIBUTE directive is similar to the DISTRIBUTE
directive but is considered executable. An array (or template) may be
redistributed at any time, provided it has been declared DYNAMIC
(see Section 3.5). Any other arrays currently
ultimately aligned with an array (or template) when it is redistributed
are also remapped to reflect the new distribution, in such a way as to
preserve alignment relationships (see Section 3.4).
(This can require a lot of computational and communication effort at
run time; the programmer must take care when using this feature.)
The DISTRIBUTE directive may appear only in the specification-part
of a scoping unit. The REDISTRIBUTE directive may appear
only in the execution-part of a scoping unit. The principal
difference between DISTRIBUTE and REDISTRIBUTE is
that DISTRIBUTE must contain only a specification-expr as the
argument to a BLOCK or CYCLIC option, whereas in REDISTRIBUTE
such an argument may be any integer expression. Another difference is that
DISTRIBUTE is an attribute, and so can be combined with other attributes
as part of a combined-directive, whereas REDISTRIBUTE is not an attribute
(although a REDISTRIBUTE statement may be written in the
style of attributed syntax, using ``::'' punctuation).
Formally, the syntax of the DISTRIBUTE and
REDISTRIBUTE directives is:
Define the ceiling division function CD(J,K) = (J+K-1)/K (using
Fortran integer arithmetic with truncation toward zero).
Define the ceiling remainder function CR(J,K) = J-K*CD(J,K).
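As an illustration only, the two helper functions can be modeled in Python. The names fortran_div, cd, and cr are hypothetical; fortran_div mimics Fortran integer division, which truncates toward zero, whereas Python's // operator rounds toward negative infinity.

```python
def fortran_div(a: int, b: int) -> int:
    """Integer division truncating toward zero, as Fortran does."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def cd(j: int, k: int) -> int:
    """Ceiling division: CD(J,K) = (J+K-1)/K."""
    return fortran_div(j + k - 1, k)


def cr(j: int, k: int) -> int:
    """Ceiling remainder: CR(J,K) = J - K*CD(J,K)."""
    return j - k * cd(j, k)


print(cd(100, 16), cr(100, 16))  # 7 -12
print(cd(96, 16), cr(96, 16))    # 6 0
```

For positive J and K, CR(J,K) lies between -(K-1) and 0, which is why the position formulas later add it to the block size.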
The dimensions of a processor arrangement appearing as a dist-target
are said to correspond in left-to-right order with
those dimensions of a distributee for which the corresponding
dist-format is not *. In the example directive
DISTRIBUTE (BLOCK,*,BLOCK) ONTO SQUARE:: D2,D3,D4, processor
arrangement SQUARE must be two-dimensional; its first dimension
corresponds to the first dimensions of D2, D3, and D4
and its second dimension corresponds to the third dimensions of D2, D3, and D4.
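The correspondence rule can be sketched in Python as follows. The function name corresponding_dims and the representation of a dist-format-list as a list of strings are assumptions made for this illustration, not part of HPF.

```python
def corresponding_dims(dist_formats):
    """Map each processor-arrangement dimension to the distributee
    dimension it corresponds to (both 1-based), skipping '*' entries."""
    non_star = [dim for dim, fmt in enumerate(dist_formats, start=1) if fmt != "*"]
    return {proc_dim: dim for proc_dim, dim in enumerate(non_star, start=1)}


# The (BLOCK,*,BLOCK) example: SQUARE needs rank 2.
mapping = corresponding_dims(["BLOCK", "*", "BLOCK"])
print(mapping)       # {1: 1, 2: 3}
print(len(mapping))  # required rank of the processor arrangement: 2
```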
Let d be the size of a distributee in a certain dimension and
let p be the size of the processor arrangement in the corresponding
dimension. For simplicity, assume all dimensions have a lower bound of
1. Then BLOCK(m) means that a distributee position
whose index along that dimension is j is mapped to an abstract
processor whose index along the corresponding dimension of the
processor arrangement is CD(j,m), and is position number m+CR(j,m)
among the positions mapped to that abstract processor. (Note that
m*p >= d must be true.) The first distributee position in abstract
processor k along that axis is position number 1+m*(k-1).
The block size m must be a positive integer.
BLOCK by definition means the same as BLOCK(CD(d,p)).
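A minimal Python sketch of the BLOCK(m) rule, assuming lower bounds of 1 and positive arguments (so Python's // agrees with Fortran division). The names block_map and default_block_size are hypothetical.

```python
def cd(j, k):
    # Ceiling division for positive integers, CD(J,K) = (J+K-1)/K.
    return (j + k - 1) // k


def block_map(j, m):
    """Return (processor index, position within that processor) for BLOCK(m)."""
    proc = cd(j, m)
    pos = m + (j - m * proc)  # m + CR(j,m)
    return proc, pos


def default_block_size(d, p):
    """Block size implied by BLOCK with no argument: CD(d,p)."""
    return cd(d, p)


m = default_block_size(100, 16)
print(m)                  # 7
print(block_map(1, m))    # (1, 1)
print(block_map(8, m))    # (2, 1)
print(block_map(100, m))  # (15, 2)
```

Element 8 is the first position on processor 2, in agreement with the formula 1+m*(k-1) = 1+7*1.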
CYCLIC(m) means that a distributee position whose index
along that dimension is j is mapped to an abstract processor whose
index along the corresponding dimension of the processor arrangement is
1+MODULO(CD(j,m)-1,p). The first distributee
position in abstract processor k along that axis is position number
1+m*(k-1).
The block size m must be a positive integer.
CYCLIC by definition means the same as CYCLIC(1).
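The CYCLIC(m) owner formula can be modeled the same way. This is a sketch under the same assumptions (lower bound 1, positive arguments); cyclic_owner is a hypothetical name. Python's % with a positive divisor behaves like Fortran's MODULO here.

```python
def cd(j, k):
    # Ceiling division for positive integers, CD(J,K) = (J+K-1)/K.
    return (j + k - 1) // k


def cyclic_owner(j, m, p):
    """Processor index for position j under CYCLIC(m) on p processors:
    1 + MODULO(CD(j,m) - 1, p)."""
    return 1 + (cd(j, m) - 1) % p


# CYCLIC(3) on 16 processors: blocks of 3 dealt out round-robin.
print([cyclic_owner(j, 3, 16) for j in (1, 3, 4, 48, 49, 100)])  # [1, 1, 2, 16, 1, 2]
# Plain CYCLIC is CYCLIC(1): element 17 wraps back to processor 1.
print(cyclic_owner(17, 1, 16))  # 1
```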
CYCLIC(m) and BLOCK(m) imply the same distribution
when m*p >= d, but BLOCK(m) additionally
asserts that the distribution will not wrap around in a cyclic manner,
which a compiler cannot determine at compile time if m is not
constant. Note that CYCLIC and BLOCK (without argument
expressions) do not imply the same distribution unless p >= d,
a degenerate case in which the block size is 1 and the distribution
does not wrap around.
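The equivalence claim can be checked numerically. The sketch below compares the two owner formulas position by position; same_owners is a hypothetical helper, and the comparison deliberately ignores the conformance requirement m*p >= d so that the failing case can be shown.

```python
def cd(j, k):
    # Ceiling division for positive integers, CD(J,K) = (J+K-1)/K.
    return (j + k - 1) // k


def same_owners(d, p, block_m, cyclic_m):
    """True if BLOCK(block_m) and CYCLIC(cyclic_m) assign every position
    1..d to the same processor index."""
    return all(
        cd(j, block_m) == 1 + (cd(j, cyclic_m) - 1) % p
        for j in range(1, d + 1)
    )


print(same_owners(100, 16, 8, 8))  # True:  8*16 >= 100, no wrap-around
print(same_owners(100, 16, 6, 6))  # False: 6*16 < 100, CYCLIC(6) wraps
# BLOCK vs CYCLIC without arguments, i.e. BLOCK(CD(d,p)) vs CYCLIC(1):
print(same_owners(100, 16, cd(100, 16), 1))  # False
print(same_owners(10, 16, cd(10, 16), 1))    # True: p >= d, block size 1
```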
Suppose that we have 16 abstract processors and an array of length 100:
Distributing the array BLOCK(8):
Distributing the array CYCLIC(3):
A DISTRIBUTE or REDISTRIBUTE directive must not cause
any data object associated with the distributee via storage association
(COMMON or EQUIVALENCE) to be mapped such that storage
units of a scalar data object are split across more than one abstract processor.
See Section for further discussion of storage association.
The statement form of a DISTRIBUTE or REDISTRIBUTE directive
may be considered an abbreviation for an attributed form that
happens to mention only one distributee; for example,
REAL SALAMI(10000)
!HPF$ DISTRIBUTE SALAMI(BLOCK)
specifies that the array SALAMI should be distributed across
some set of abstract processors by slicing it uniformly into blocks of
contiguous elements. If there are 50 processors, the directive
implies that the array should be divided into groups of 200 elements,
with SALAMI(1:200) mapped to the first processor,
SALAMI(201:400) mapped to the second processor, and so on.
If there is only one processor, the entire array is mapped to that processor
as a single block of 10000 elements.
REAL WEISSWURST(10000)
!HPF$ DISTRIBUTE WEISSWURST(BLOCK(256))
This specifies that groups of exactly 256 elements should be
mapped to successive abstract processors.
(There must be at least ceiling(10000/256) = 40 abstract processors
if the directive is to be satisfied. The fortieth processor
will contain a partial block of only 16 elements, namely
WEISSWURST(9985:10000).)
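The element ranges quoted for SALAMI and WEISSWURST follow from the first-position formula 1+m*(k-1). A small Python sketch, with the hypothetical helper block_bounds, reproduces them:

```python
def cd(j, k):
    # Ceiling division for positive integers, CD(J,K) = (J+K-1)/K.
    return (j + k - 1) // k


def block_bounds(d, m, k):
    """Inclusive (first, last) element range held by processor k under
    BLOCK(m) for an array of size d, or None if processor k is empty."""
    first = 1 + m * (k - 1)
    if first > d:
        return None
    return first, min(d, m * k)


# SALAMI(10000) distributed BLOCK over 50 processors: block size CD(10000,50).
m = cd(10000, 50)
print(m, block_bounds(10000, m, 1), block_bounds(10000, m, 2))  # 200 (1, 200) (201, 400)

# WEISSWURST(10000) distributed BLOCK(256): 40 processors are needed.
print(cd(10000, 256))                # 40
print(block_bounds(10000, 256, 40))  # (9985, 10000)
```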
REAL DECK_OF_CARDS(52)
!HPF$ DISTRIBUTE DECK_OF_CARDS(CYCLIC)
If there are 4 abstract processors,
the first processor will contain DECK_OF_CARDS(1:49:4),
the second processor will contain DECK_OF_CARDS(2:50:4),
the third processor will contain DECK_OF_CARDS(3:51:4),
and the fourth processor will contain DECK_OF_CARDS(4:52:4).
Successive array elements are dealt out to successive abstract processors
in round-robin fashion.
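The round-robin dealing can be reproduced with a short Python sketch. The helper name cyclic_elements is hypothetical, and the triplet notation 1:49:4 is modeled with a Python range.

```python
def cyclic_elements(d, p, k):
    """Elements of a size-d array held by processor k (1-based) under
    CYCLIC on p processors."""
    return list(range(k, d + 1, p))


for k in range(1, 5):
    held = cyclic_elements(52, 4, k)
    # Print in Fortran triplet form first:last:stride, plus the card count.
    print(k, f"{held[0]}:{held[-1]}:4", len(held))
# 1 1:49:4 13
# 2 2:50:4 13
# 3 3:51:4 13
# 4 4:52:4 13
```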
INTEGER CHESS_BOARD(8,8), GO_BOARD(19,19)
!HPF$ DISTRIBUTE CHESS_BOARD(BLOCK, BLOCK)
!HPF$ DISTRIBUTE GO_BOARD(CYCLIC,*)
The CHESS_BOARD array will be carved up into contiguous
rectangular patches, which will be distributed onto a two-dimensional
arrangement of abstract processors. The GO_BOARD array will have its
rows distributed cyclically over a one-dimensional arrangement of
abstract processors. (The ``*'' specifies that GO_BOARD is not to
be distributed along its second axis; thus an entire row is to be distributed as one object. This is sometimes called ``on-processor''
distribution.)
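To make the two-dimensional case concrete, the sketch below assumes a 2 x 2 processor arrangement for CHESS_BOARD and a 4-processor arrangement for GO_BOARD. The text does not fix these shapes; they, and all function names, are chosen purely for illustration.

```python
def cd(j, k):
    # Ceiling division for positive integers, CD(J,K) = (J+K-1)/K.
    return (j + k - 1) // k


CHESS_PROCS = (2, 2)  # assumed arrangement shape, not given in the text
GO_PROCS = 4          # assumed arrangement size, not given in the text


def chess_owner(i, j):
    """Owner of CHESS_BOARD(i,j) under (BLOCK, BLOCK): each axis is
    blocked independently with block size CD(8, p)."""
    return (cd(i, cd(8, CHESS_PROCS[0])), cd(j, cd(8, CHESS_PROCS[1])))


def go_owner(i, j):
    """Owner of GO_BOARD(i,j) under (CYCLIC, *): only the row index
    matters, so a whole row lives on one processor."""
    return 1 + (i - 1) % GO_PROCS


print(chess_owner(1, 1), chess_owner(4, 5), chess_owner(8, 8))  # (1, 1) (1, 2) (2, 2)
print(go_owner(1, 1), go_owner(1, 19), go_owner(19, 7))         # 1 1 3
```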
H303 distribute-directive is DISTRIBUTE distributee dist-directive-stuff
H304 redistribute-directive is REDISTRIBUTE distributee dist-directive-stuff
or REDISTRIBUTE dist-attribute-stuff :: distributee-list
H305 dist-directive-stuff is dist-format-clause [ dist-onto-clause ]
H306 dist-attribute-stuff is dist-directive-stuff
or dist-onto-clause
H307 distributee is object-name
or template-name
H308 dist-format-clause is ( dist-format-list )
or * ( dist-format-list )
or *
H309 dist-format is BLOCK [ ( int-expr ) ]
or CYCLIC [ ( int-expr ) ]
or *
H310 dist-onto-clause is ONTO dist-target
H311 dist-target is processors-name
or * processors-name
or *
Constraint: An object-name mentioned as a distributee
must be a simple name and not a subobject designator.
Constraint: An object-name mentioned as a distributee may not
appear as an alignee.
Constraint: An object-name mentioned as a distributee may not
have the POINTER attribute.
Constraint: A distributee that appears in a REDISTRIBUTE
directive must have the DYNAMIC attribute (see Section
3.5).
Constraint: If a dist-format-list is specified, its length must
equal the rank of each distributee.
Constraint: If both a dist-format-list and a processors-name
appear, the number of elements of the dist-format-list
that are not ``*'' must equal the rank of the named
processor arrangement.
Constraint: If a processors-name appears but not a
dist-format-list, the rank of each distributee
must equal the rank of the named processor arrangement.
Constraint: If either the dist-format-clause or the dist-target
in a DISTRIBUTE directive begins with ``*'' then
every distributee must be a dummy argument.
Constraint: Neither the dist-format-clause nor the dist-target
in a REDISTRIBUTE may begin with ``*''.
Constraint: Any int-expr appearing in a dist-format of a
DISTRIBUTE directive must be a specification-expr.
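The three rank constraints above lend themselves to a mechanical check. The following Python sketch is an illustrative model only: check_ranks and its argument conventions (None meaning "not specified") are hypothetical, and it covers just the rank rules, not the other constraints.

```python
def check_ranks(dist_formats, distributee_rank, processors_rank):
    """Return a list of violated rank constraints (empty if none).

    dist_formats: list such as ["BLOCK", "*", "BLOCK"], or None if absent.
    processors_rank: rank of the named processor arrangement, or None.
    """
    errors = []
    if dist_formats is not None and len(dist_formats) != distributee_rank:
        errors.append("dist-format-list length must equal distributee rank")
    if processors_rank is not None:
        if dist_formats is not None:
            non_star = sum(1 for fmt in dist_formats if fmt != "*")
            if non_star != processors_rank:
                errors.append("non-* formats must equal processor arrangement rank")
        elif distributee_rank != processors_rank:
            errors.append("distributee rank must equal processor arrangement rank")
    return errors


print(check_ranks(["BLOCK", "*", "BLOCK"], 3, 2))  # []
print(len(check_ranks(["BLOCK", "BLOCK"], 3, 2)))  # 1
print(len(check_ranks(None, 2, 1)))                # 1
```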
Note that the possibility of a DISTRIBUTE directive of the form
!HPF$ DISTRIBUTE dist-attribute-stuff :: distributee-list
is covered by syntax rule H301 for a
combined-directive.
Examples:
!HPF$ DISTRIBUTE D1(BLOCK)
!HPF$ DISTRIBUTE (BLOCK,*,BLOCK) ONTO SQUARE:: D2,D3,D4
The meanings of the alternatives for dist-format are given below.
!HPF$ PROCESSORS SEDECIM(16)
REAL CENTURY(100)
Distributing the array BLOCK (which in this case would mean
the same as BLOCK(7)):
!HPF$ DISTRIBUTE CENTURY(BLOCK) ONTO SEDECIM
results in this mapping of array elements onto abstract processors:
!HPF$ DISTRIBUTE CENTURY(CYCLIC) ONTO SEDECIM
results in this mapping of array elements onto abstract processors:
!HPF$ DISTRIBUTE CENTURY(BLOCK(256)) ONTO SEDECIM
results in having only one non-empty block (a partially filled one at that,
having only 100 elements) on processor 1, with processors 2 through 16
having no elements of the array.
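The per-processor element counts for the CENTURY examples can be derived from the definitions of BLOCK(m) and CYCLIC(m). The Python sketch below is a model of those definitions, not output from an HPF compiler; counts is a hypothetical helper, and lower bounds of 1 are assumed.

```python
def cd(j, k):
    # Ceiling division for positive integers, CD(J,K) = (J+K-1)/K.
    return (j + k - 1) // k


def counts(d, p, kind, m=None):
    """Number of elements of a size-d array on each of p processors."""
    if kind == "BLOCK":
        m = cd(d, p) if m is None else m
        if m * p < d:
            raise ValueError("BLOCK(m) requires m*p >= d")
        owners = [cd(j, m) for j in range(1, d + 1)]
    else:  # CYCLIC
        m = 1 if m is None else m
        owners = [1 + (cd(j, m) - 1) % p for j in range(1, d + 1)]
    return [owners.count(k) for k in range(1, p + 1)]


print(counts(100, 16, "BLOCK"))       # fourteen 7s, then 2, then 0
print(counts(100, 16, "BLOCK", 8))    # twelve 8s, then 4, then three 0s
print(counts(100, 16, "CYCLIC"))      # four 7s, then twelve 6s
print(counts(100, 16, "CYCLIC", 3))   # 9, 7, then fourteen 6s
print(counts(100, 16, "BLOCK", 256))  # 100, then fifteen 0s
```

Under this model, plain BLOCK leaves processor 16 empty, and BLOCK(8) leaves processors 14 through 16 empty.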