**Next:** Aggregate Communication Examples
**Up:** The HPF Model
**Previous:** The HPF Model

The following examples illustrate the communication requirements of scalar assignment statements. The purpose is to illustrate the implications of data distribution specifications on communication requirements for parallel execution. The explanations given do not necessarily reflect the actual compilation process.

Consider the following statements:
    REAL a(1000), b(1000), c(1000), x(500), y(0:501)
    INTEGER inx(1000)
    !HPF$ PROCESSORS procs(10)
    !HPF$ DISTRIBUTE (BLOCK) ONTO procs :: a, b, inx
    !HPF$ DISTRIBUTE (CYCLIC) ONTO procs :: c
    !HPF$ ALIGN x(i) WITH y(i+1)
    ...
    a(i) = b(i)                      ! Assignment 1
    x(i) = y(i+1)                    ! Assignment 2
    a(i) = c(i)                      ! Assignment 3
    a(i) = a(i-1) + a(i) + a(i+1)    ! Assignment 4
    c(i) = c(i-1) + c(i) + c(i+1)    ! Assignment 5
    x(i) = y(i)                      ! Assignment 6
    a(i) = a(inx(i)) + b(inx(i))     ! Assignment 7

In this example, the `PROCESSORS` directive specifies a linear
arrangement of 10 processors. The `DISTRIBUTE` directives
recommend to the compiler that the arrays `a`, `b`, and `inx` should be distributed among the 10 processors with blocks of 100
contiguous elements per processor. The array `c` is to be
cyclically distributed among the processors with `c(1)`, `c(11)`, ..., `c(991)` mapped onto processor `procs(1)`;
`c(2)`, `c(12)`, ..., `c(992)` mapped onto processor
`procs(2)`; and so on. The complete mapping of arrays `x` and
`y` onto the processors is not specified, but their relative
alignment is indicated by the `ALIGN` directive. The `ALIGN`
statement causes `x(i)` and `y(i+1)` to be stored on the same
processor for all values of `i`, regardless of the actual
distribution chosen by the compiler for `x` and `y` (`y(0)`
and `y(1)` are not aligned with any element of `x`). The `PROCESSORS`, `DISTRIBUTE`, and `ALIGN` directives are discussed
in detail in Section .
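
The mapping described above can be checked with a few lines of ordinary Fortran. The following is only a sketch: it assumes the usual definitions of `BLOCK` (block size equal to the ceiling of the array extent divided by the number of processors) and `CYCLIC` (round-robin assignment), and the helper names `block_owner` and `cyclic_owner` are hypothetical, not part of HPF. It says nothing about how a compiler actually lays out the data.

```fortran
program owner_map
  implicit none
  integer, parameter :: n = 1000       ! array extent, as in the example
  integer, parameter :: nprocs = 10    ! size of the procs arrangement
  integer :: k

  ! a, b and inx are BLOCK-distributed: 100 contiguous elements per processor.
  print '(a,i0,a)', 'a(100) -> procs(', block_owner(100), ')'
  print '(a,i0,a)', 'a(101) -> procs(', block_owner(101), ')'

  ! c is CYCLIC-distributed: c(1), c(11), c(21), ... share one processor.
  do k = 1, 21, 10
     print '(a,i0,a,i0,a)', 'c(', k, ') -> procs(', cyclic_owner(k), ')'
  end do
  print '(a,i0,a)', 'c(992) -> procs(', cyclic_owner(992), ')'

contains

  ! Hypothetical helper: 1-based processor index under a BLOCK distribution.
  pure integer function block_owner(idx)
    integer, intent(in) :: idx
    integer :: blk
    blk = (n + nprocs - 1) / nprocs    ! ceiling(n / nprocs) in integer arithmetic
    block_owner = (idx - 1) / blk + 1
  end function block_owner

  ! Hypothetical helper: 1-based processor index under a CYCLIC distribution.
  pure integer function cyclic_owner(idx)
    integer, intent(in) :: idx
    cyclic_owner = mod(idx - 1, nprocs) + 1
  end function cyclic_owner

end program owner_map
```

Under these assumptions the program places `a(100)` on `procs(1)` and `a(101)` on `procs(2)`, while `c(1)`, `c(11)`, and `c(21)` all land on `procs(1)` and `c(992)` on `procs(2)`, matching the description in the text.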

Although Assignment 3 (`a(i) = c(i)`) looks very similar to the
first assignment, its communication requirements are very different
because `a` and `c` are distributed differently. Array elements
`a(i)` and `c(i)` are mapped to the same processor for only
10% of the possible values of `i`. (This can be seen by
inspecting the definitions of `BLOCK` and `CYCLIC` in
Section .)
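
The 10% figure can be reproduced by direct counting. The sketch below assumes 10 processors, a `BLOCK` size of 100 for `a`, and round-robin `CYCLIC` placement for `c`; the owner formulas are written inline and are an illustration of those definitions, not compiler output.

```fortran
program colocated_count
  implicit none
  integer, parameter :: n = 1000                         ! extent of a and c
  integer, parameter :: nprocs = 10                      ! size of procs
  integer, parameter :: blk = (n + nprocs - 1) / nprocs  ! BLOCK size (100)
  integer :: i, same

  same = 0
  do i = 1, n
     ! Owner of a(i) under BLOCK versus owner of c(i) under CYCLIC.
     if ((i - 1) / blk + 1 == mod(i - 1, nprocs) + 1) same = same + 1
  end do

  print '(i0,a,i0,a)', same, ' of ', n, ' values of i keep a(i) and c(i) together'
end program colocated_count
```

Within each block of 100 consecutive indices, exactly 10 have the matching cyclic owner, so the count is 100 of 1000. The remaining 90% of the assignments `a(i) = c(i)` need one element moved between processors.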

    REAL a(1000), b(1000), c(1000)
    !HPF$ DISTRIBUTE (CYCLIC) ONTO procs :: a, b, c
    ...
    a(i) = b(i+2)                      ! Statement 1
    b(i) = c(i+3)                      ! Statement 2
    b(i+2) = 2 * a(i+2)                ! Statement 3
    c(i) = a(i+1) + b(i+2) + c(i+3)    ! Statement 4

Statements 1 and 2 each require one array element to be communicated
for any value of `i`. Statement 3 has no inherent communication.
To simplify the discussion, assume that all four statements are
executed on the processor storing the array element being assigned.
Then, for Statement 4:

- Element `a(i+1)` induces communication, since it is not local and was not communicated earlier;
- Element `b(i+2)` induces communication, since it is nonlocal and has changed since its last use; and
- Element `c(i+3)` *does not* induce new communication, since it was used in Statement 2 and has not changed since.

Thus, the minimum total inherent communication in this program fragment is four array elements. It is important to note that this is a minimum. Some compilation strategies may produce communication for element `c(i+3)` in the last statement.
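
The count of four can be traced mechanically. The sketch below is a toy bookkeeping model, not a description of any compiler: it assumes 10 processors, a `CYCLIC` owner function, an arbitrary sample index `i = 5`, and that a fetched element stays usable on the receiving processor until that element is assigned again. The names `owner`, `use_elem`, `assign_elem`, and `cached` are hypothetical.

```fortran
program inherent_communication
  implicit none
  integer, parameter :: n = 1000       ! array extent, as in the example
  integer, parameter :: nprocs = 10    ! assumed size of procs
  integer, parameter :: i = 5          ! arbitrary sample index
  integer, parameter :: ia = 1, ib = 2, ic = 3
  character(len=1), parameter :: names(3) = ['a', 'b', 'c']
  ! cached(arr, idx, p): processor p holds a current copy of a nonlocal element
  logical :: cached(3, n, nprocs)
  integer :: total

  cached = .false.
  total = 0

  ! Statement 1: a(i) = b(i+2), executed by the owner of a(i)
  call use_elem(1, ib, i + 2, owner(i))
  call assign_elem(ia, i)

  ! Statement 2: b(i) = c(i+3), executed by the owner of b(i)
  call use_elem(2, ic, i + 3, owner(i))
  call assign_elem(ib, i)

  ! Statement 3: b(i+2) = 2 * a(i+2), executed by the owner of b(i+2)
  call use_elem(3, ia, i + 2, owner(i + 2))
  call assign_elem(ib, i + 2)

  ! Statement 4: c(i) = a(i+1) + b(i+2) + c(i+3), executed by the owner of c(i)
  call use_elem(4, ia, i + 1, owner(i))
  call use_elem(4, ib, i + 2, owner(i))
  call use_elem(4, ic, i + 3, owner(i))
  call assign_elem(ic, i)

  print '(a,i0)', 'array elements communicated: ', total

contains

  ! CYCLIC owner of element idx (1-based processor index).
  pure integer function owner(idx)
    integer, intent(in) :: idx
    owner = mod(idx - 1, nprocs) + 1
  end function owner

  ! Record a read of arr(idx) on processor p, counting a fetch when needed.
  subroutine use_elem(stmt, arr, idx, p)
    integer, intent(in) :: stmt, arr, idx, p
    if (owner(idx) == p) return        ! element is local
    if (cached(arr, idx, p)) return    ! fetched earlier and unchanged since
    cached(arr, idx, p) = .true.
    total = total + 1
    print '(a,i0,a,a,a,i0,a)', 'Statement ', stmt, ' fetches ', names(arr), '(', idx, ')'
  end subroutine use_elem

  ! Record an assignment to arr(idx): every remote copy becomes stale.
  subroutine assign_elem(arr, idx)
    integer, intent(in) :: arr, idx
    cached(arr, idx, :) = .false.
  end subroutine assign_elem

end program inherent_communication
```

Under these assumptions the trace reports one fetch each for Statements 1 and 2, none for Statement 3, and two for Statement 4 (`a(i+1)` and the updated `b(i+2)`), for a total of four. Dropping the `cached` check would model a strategy that fetches `c(i+3)` again in Statement 4.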

*paula@erc.msstate.edu*

Thu Jul 21 17:05:43 CDT 1994