The original program for this example is due to Michael Wolfe of Oregon Graduate Institute.
This program performs matrix multiplication by a systolic algorithm. Without the EXECUTE-ON-HOME and LOCAL_ACCESS directives, the compiler would have a hard time detecting that all accesses to A, B, and C are in fact local.
      REAL A(N,N), B(N,N), C(N,N)
      PARAMETER(NOP = NUMBER_OF_PROCESSORS())
!HPF$ TEMPLATE T(2*N,N)   ! to allow wrap around mapping
!HPF$ ALIGN B(I,J) WITH T(N+I,J)
!HPF$ REALIGN B(I,J) WITH T(N-IT*IB+I,J)
! data parallel loop
!HPF$ EXECUTE (IP) ON_HOME T(IP*IB+1,1), LOCAL_ACCESS A, B, C
      DO IP = 0, NOP-1
        ITP = MOD( IT+IP, NOP )
        DO I = 1, IB
          DO J = 1, N
            DO K = 1, IB
              C(IP*IB+I,J) = C(IP*IB+I,J) +
     1             A(IP*IB+I,ITP*IB+K)*B(ITP*IB+K,J)
            ENDDO  ! K
          ENDDO  ! J
        ENDDO  ! I
      ENDDO  ! IP
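The index arithmetic in the loop nest can be checked with a small serial replay. The following Python sketch is an illustration only and is not part of the original program. It rests on several assumptions that the excerpt does not state: an outer time-step loop IT = 0, ..., NOP-1 around the REALIGN and the IP loop, a block size IB = N/NOP, a distribution of the first dimension of T in blocks of IB rows dealt cyclically over the NOP processors, and A and C aligned with T(I,J). The helper names systolic_matmul, owner, and all_accesses_local are hypothetical.

```python
def systolic_matmul(A, B, nop):
    """Serial replay of the loop nest for N x N lists of lists A and B."""
    n = len(A)
    assert n % nop == 0, "sketch assumes NOP divides N"
    ib = n // nop                      # assumed block size IB = N / NOP
    C = [[0.0] * n for _ in range(n)]
    for it in range(nop):              # assumed outer time-step loop IT
        for ip in range(nop):          # one IP iteration per processor
            itp = (it + ip) % nop
            for i in range(1, ib + 1):
                for j in range(1, n + 1):
                    for k in range(1, ib + 1):
                        # 1-based Fortran subscripts shifted to 0-based lists
                        C[ip * ib + i - 1][j - 1] += (
                            A[ip * ib + i - 1][itp * ib + k - 1]
                            * B[itp * ib + k - 1][j - 1]
                        )
    return C


def owner(template_row, ib, nop):
    """Owner of a 1-based row of T under the assumed block-cyclic distribution."""
    return ((template_row - 1) // ib) % nop


def all_accesses_local(n, nop):
    """True if every A, B, C row touched by iteration IP lives on IP's home."""
    ib = n // nop
    for it in range(nop):
        for ip in range(nop):
            home = owner(ip * ib + 1, ib, nop)      # ON_HOME T(IP*IB+1,1)
            itp = (it + ip) % nop
            for i in range(1, ib + 1):
                # A and C rows IP*IB+I, assumed aligned with T(I,J)
                if owner(ip * ib + i, ib, nop) != home:
                    return False
            for k in range(1, ib + 1):
                # B row ITP*IB+K under REALIGN B(I,J) WITH T(N-IT*IB+I,J)
                t_row = n - it * ib + itp * ib + k
                if not 1 <= t_row <= 2 * n:
                    return False
                if owner(t_row, ib, nop) != home:
                    return False
    return True
```

Under these assumptions, ITP takes each value 0, ..., NOP-1 exactly once per IP as IT advances, so every block of K indices contributes once and the result equals the ordinary product of A and B. The template rows reached by the realigned B stay within 1, ..., 2*N, which is consistent with the comment that T(2*N,N) allows the wrap-around mapping.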