Next:
Conventions for Calling Up: High Performance Fortran Language Previous: Coding Local Routines

Conventions for Local Subprograms

All HPF arrays accessible to an extrinsic procedure (arrays passed as arguments) are logically carved up into pieces; the local procedure executing on a particular physical processor sees an array containing just those elements of the global array that are mapped to that physical processor.

It is important not to confuse the extrinsic procedure, which is conceptually a single procedural entity called from the HPF program, with the local procedures, which are executed on each node, one apiece. An invocation of an extrinsic procedure results in a separate invocation of a local procedure on each processor. The execution of an extrinsic procedure consists of the concurrent execution of a local procedure on each executing processor. Each local procedure may terminate at any time by executing a RETURN statement. However, the extrinsic procedure as a whole terminates only after every local procedure has terminated; in effect, the processors are synchronized before return to a global HPF caller.

It is technically feasible to define extrinsic procedures in any other parallel language that maps to this basic SPMD execution model, or in any sequential language, including single-processor Fortran 90, with the understanding that one copy of the sequential code is executed on each processor. The extrinsic procedure interface is designed to ease implementation of local procedures in languages other than HPF; however, it is beyond the scope of the HPF specification or this annex to dictate implementation requirements for such languages or implementations. Nevertheless, a suggested way to use Fortran 90 to define local procedures is discussed in Section .

With the exception of returning from a local procedure to the global caller that initiated local execution, there is no implicit synchronization of the locally executing processors. A local procedure may use any control structure whatsoever. To access data outside the processor requires either preparatory communication to copy data into the processor before running the local code, or communication between the separately executing copies of the local procedure. Individual implementations may provide implementation-dependent means for communicating, for example through a message-passing library or a shared-memory mechanism. Such communication mechanisms are beyond the scope of this specification. Note, however, that many useful portable algorithms that require only independence of control structure can take advantage of local routines, without requiring a communication facility.

This model assumes only that array axes are mapped independently to axes of a rectangular processor grid, each array axis to at most one processor axis (no ``skew'' distributions) and no two array axes to the same processor axis. This restriction suffices to ensure that each physical processor contains a subset of array elements that can be locally arranged in a rectangular configuration. (Of course, to compute the global indices of an element given its local indices, or vice versa, may be quite a tangled computation-but it will be possible.)

It is recommended that if, in any given implementation, an interface kind does not obey the conventions described in the section, then the name of that interface kind should not end in ``_LOCAL''.




Next:
Conventions for Calling Up: High Performance Fortran Language Previous: Coding Local Routines


paula@erc.msstate.edu
Thu Dec 8 16:17:11 CST 1994