The mapping of a global HPF array to the physical processors places one
or more *blocks*, which are groups of elements with consecutive
indices, on each processor. The number of blocks mapped to a
processor is the product of the number of blocks of consecutive indices
in each dimension that are mapped to it. For example, a rank-one array
`X` with a `CYCLIC(4)` distribution will have blocks containing
four elements, except for a possible last block having fewer elements. On the other hand, if `X` is first
aligned to a template or an array having a `CYCLIC(4)`
distribution, and a non-unit stride is employed (as in
`!HPF$ ALIGN X(I) WITH T(3*I)`), then its blocks may have fewer than four
elements. In this case, when the align stride is three and the
template has a block-cyclic distribution with four template elements
per block, the blocks of `X` have either one or two elements each.
If the align stride were five, then all blocks of `X` would have
exactly one element, as template blocks to which no array element is
aligned are not counted in the reckoning of numbers of blocks.
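
The block sizes quoted above can be checked with a little index arithmetic. The following Python sketch is only a model of that arithmetic, not HPF code: the function `block_sizes` and its parameters are hypothetical names introduced for illustration, and it assumes one-based indices with `X(I)` aligned to template element `T(stride*I)`.

```python
from collections import Counter

def block_sizes(n, stride=1, k=4):
    """Sizes of the blocks of X(1:n), in template order.

    Assumes X(I) is aligned with T(stride*I) and that T is
    distributed CYCLIC(k). Template blocks to which no element
    of X is aligned are not counted.
    """
    # Template element t (one-based) lies in template block (t - 1) // k.
    counts = Counter((stride * i - 1) // k for i in range(1, n + 1))
    return [counts[b] for b in sorted(counts)]

print(block_sizes(10))            # unit stride: full blocks, shorter last block
print(block_sizes(8, stride=3))   # align stride three
print(block_sizes(8, stride=5))   # align stride five
```

With unit stride and ten elements the sizes are 4, 4, 2. With stride three every block has one or two elements, and with stride five every counted block has exactly one. The block sizes do not depend on the number of processors, which only determines which processor owns each block.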

The portion of a global array argument associated with a dummy argument
in an HPF_LOCAL subprogram may be accessed in a block-by-block
fashion. Three of the local library routines, `LOCAL_BLKCNT`, `LOCAL_LINDEX`, and `LOCAL_UINDEX`, allow easy access to the local
storage of a particular block. Their use for this purpose is
illustrated by the following example, in which the local data are
initialized one block at a time:

    EXTRINSIC(HPF_LOCAL) SUBROUTINE NEWKI_DONT_HEBLOCK(X)
      USE HPF_LOCAL_LIBRARY
      REAL X(:,:,:)
      INTEGER BL(3)
      INTEGER, ALLOCATABLE :: LIND1(:), LIND2(:), LIND3(:)
      INTEGER, ALLOCATABLE :: UIND1(:), UIND2(:), UIND3(:)

      ! Number of local blocks in each dimension.
      BL = LOCAL_BLKCNT(X)

      ALLOCATE (LIND1(BL(1)))
      ALLOCATE (LIND2(BL(2)))
      ALLOCATE (LIND3(BL(3)))

      ALLOCATE (UIND1(BL(1)))
      ALLOCATE (UIND2(BL(2)))
      ALLOCATE (UIND3(BL(3)))

      ! Lowest and highest local index of each block, per dimension.
      LIND1 = LOCAL_LINDEX(X, DIM = 1)
      UIND1 = LOCAL_UINDEX(X, DIM = 1)

      LIND2 = LOCAL_LINDEX(X, DIM = 2)
      UIND2 = LOCAL_UINDEX(X, DIM = 2)

      LIND3 = LOCAL_LINDEX(X, DIM = 3)
      UIND3 = LOCAL_UINDEX(X, DIM = 3)

      ! Give every element of a block the same value, one block at a time.
      DO IB1 = 1, BL(1)
        DO IB2 = 1, BL(2)
          DO IB3 = 1, BL(3)
            FORALL (I1 = LIND1(IB1) : UIND1(IB1),  &
                    I2 = LIND2(IB2) : UIND2(IB2),  &
                    I3 = LIND3(IB3) : UIND3(IB3) ) &
              X(I1, I2, I3) = IB1 + 10*IB2 + 100*IB3
          ENDDO
        ENDDO
      ENDDO
    END SUBROUTINE NEWKI_DONT_HEBLOCK
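
To make the values returned by the three inquiry routines concrete, the following Python sketch models a single dimension in the simplest case. It is an illustration under stated assumptions, not the library's implementation: the function `local_blocks` is a hypothetical name, the array is assumed to be distributed `CYCLIC(k)` directly (unit align stride) over `nprocs` processors numbered from zero, and the local elements are assumed to be stored contiguously in block order with local indices starting at one.

```python
def local_blocks(n, k, nprocs, proc):
    """Model one dimension of an array of n elements on processor proc.

    Returns (blkcnt, lindex, uindex), playing the roles of
    LOCAL_BLKCNT, LOCAL_LINDEX and LOCAL_UINDEX for that dimension.
    """
    lindex, uindex = [], []
    next_local = 1                       # next free local index (one-based)
    nblocks = (n + k - 1) // k           # global number of blocks
    # Blocks are dealt out to the processors in round-robin order.
    for b in range(proc, nblocks, nprocs):
        size = min(k, n - b * k)         # the last global block may be short
        lindex.append(next_local)
        uindex.append(next_local + size - 1)
        next_local += size
    return len(lindex), lindex, uindex

# Ten elements, CYCLIC(4), two processors.
for p in range(2):
    print(p, local_blocks(10, 4, 2, p))
```

Under these assumptions processor 0 holds two blocks with local index ranges 1:4 and 5:6, and processor 1 holds one block with range 1:4. The triple loop in the subroutine above walks exactly such ranges in each of the three dimensions.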

Thu Dec 8 16:17:11 CST 1994