
Active Processor Sets

 

Active processors are an extension of the idea of processors and processor arrangements as used in HPF 2.0. HPF 2.0 assumes that a (static) set of processors exists, and that the program uses these processors to store data (e.g., through the DISTRIBUTE directive) and perform computations (e.g., by execution of FORALL statements). Finer divisions of the processor set are seldom mentioned, although they do have uses (e.g., mapping onto processor subsets as in an approved extension, Section 8.7, or in explaining the performance of computations on subarrays). Features such as task parallelism, however, require considering a more dynamic set of processors. In particular, defining these features requires an answer to the question "Which processor(s) is (are) currently executing?"

Simply put, an active processor is one that executes an HPF statement (or group of statements). Active processors perform all operations required to execute the statement(s) except (perhaps) for the initial access of data and writing of results. Some operations require certain processors to be active, as described below, but for the most part any processor can be active in the execution of any statement. An HPF program begins execution with all processors active. As described in Section 9.2, the ON directive restricts the active processor set for the duration of execution of statements in its scope. Consider this simple example (which has a reasonably intuitive meaning):

!HPF$ ON HOME( Z(INDX) )
      X(INDX-1) = X(INDX-1) + Y(INDX) * Z(INDX+1)

Let X, Y, and Z have the same distribution, which does not replicate data. Following the ON directive, the statement would be executed as follows:

  1. The processor owning Z(INDX) is identified as the active processor. On different executions of this ON block, this may be a different processor.
  2. The values of X(INDX-1), Y(INDX), and Z(INDX+1) are made available to the active processor. Because of the identical distributions, Y(INDX) is already stored there. Depending on the data distribution and the hardware running the program, retrieving the others might correspond to the active processor loading registers from memory, or it might mean one or two other processors sending messages to the active processor.
  3. The active processor performs an addition and a multiplication, using the values made available in the previous step.
  4. The result is stored to X(INDX-1), which may be on another processor. Again, this may require synchronization or other cross-processor operations.

This scheme has considerable subtleties when one of the statements involved is a function or subroutine call; Section 9.2.4 deals with these cases. Advice on the implementation of the ON directive is given in Section 9.2.2 below.
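
The four steps listed above can be traced on a small complete program. The following is a minimal sketch, not taken from the specification: the processor arrangement P(4), the array size N = 16, the BLOCK distribution, the initial values, and the choice INDX = 5 are all assumptions made for illustration.

```fortran
      PROGRAM ON_HOME_SKETCH
      INTEGER, PARAMETER :: N = 16
      REAL :: X(N), Y(N), Z(N)
      INTEGER :: INDX
! Assumption: four processors are available to the program.
!HPF$ PROCESSORS P(4)
! Identical, non-replicating distributions: with N = 16 and BLOCK,
! elements 1:4 are on P(1), 5:8 on P(2), 9:12 on P(3), 13:16 on P(4).
!HPF$ DISTRIBUTE (BLOCK) ONTO P :: X, Y, Z
      X = 1.0
      Y = 2.0
      Z = 3.0
      INDX = 5
! Step 1: the owner of Z(5), here P(2), becomes the active processor.
! Step 2: Y(5) and Z(6) are already on P(2); X(4) lives on P(1).
! Step 3: P(2) computes X(4) + Y(5) * Z(6).
! Step 4: the result is stored to X(4), which is on P(1).
!HPF$ ON HOME( Z(INDX) )
      X(INDX-1) = X(INDX-1) + Y(INDX) * Z(INDX+1)
! X(4) now holds 1.0 + 2.0 * 3.0 = 7.0.
      PRINT *, X(INDX-1)
      END PROGRAM ON_HOME_SKETCH
```

Under these assumptions only X(INDX-1) crosses a processor boundary, once when it is read and once when it is written. With INDX = 8 instead, Z(INDX+1) would also be on another processor, P(3), while X(INDX-1) would be local to the active processor.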

A few additional terms are useful in conjunction with the concept of active processors. If all processors in a set are active, then the set is called an active processor set. The set of all active processors is sometimes called the active processor set. This set is dynamic, and if a statement is executed repeatedly the active processor set may be different each time. In general, an HPF construct can only restrict the active set, not enlarge it. However, if the original active set is partitioned into several independent sets, all partitions may execute simultaneously. This is exactly how the TASK_REGION construct (described in Section 9.4) works.
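
The rule that a construct can only restrict the active set may be easier to see with nested ON blocks. The sketch below is illustrative only; the arrangement P(4), the array A, and its BLOCK distribution are assumptions, and the block form of ON (ON ... BEGIN / END ON) is the one described in Section 9.2. TASK_REGION itself is not shown here.

```fortran
      PROGRAM NESTED_ON_SKETCH
      INTEGER, PARAMETER :: N = 100
      REAL :: A(N)
! Assumption: four processors, so each owns a block of 25 elements.
!HPF$ PROCESSORS P(4)
!HPF$ DISTRIBUTE A(BLOCK) ONTO P
      A = 1.0
! Here all processors are active (the universal processor set).
!HPF$ ON HOME( A(1:50) ) BEGIN
! Active set: the owners of A(1:50), i.e. P(1) and P(2).
      A(1:50) = 2.0 * A(1:50)
!HPF$   ON HOME( A(1:25) ) BEGIN
! Active set: the owner of A(1:25), i.e. P(1) only.
! This is a subset of the enclosing active set, as required.
        A(1:25) = A(1:25) + 1.0
!HPF$   END ON
! Back to P(1) and P(2).
!HPF$ END ON
! All processors are active again.
      PRINT *, A(1), A(50), A(51)
      END PROGRAM NESTED_ON_SKETCH
```

Each inner block names processors drawn from the set that is active on entry to it, so the active set shrinks on the way in and is restored on the way out. P(3) and P(4) are inactive with respect to the statements inside the outer ON block, but active with respect to the PRINT statement.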

The universal processor set is the set of all processors available to the HPF program. It is precisely the set of processors that is active when execution of the main program begins.

A processor that is not in the active set is called inactive. (Note that a processor may be inactive with respect to one statement, but active with respect to another. This is common in TASK_REGION constructs.)

It is sometimes necessary to query properties of the active processor set; this is accomplished by the approved extension intrinsics ACTIVE_NUM_PROCS and ACTIVE_PROCS_SHAPE described in Section 12.1.
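
As a hedged illustration of such a query, the sketch below calls ACTIVE_NUM_PROCS before and inside an ON block. It assumes that the intrinsic may be called without arguments, in the manner of NUMBER_OF_PROCESSORS, and that at least two processors are available; Section 12.1 remains the authoritative description of the interface. The variable names NALL and NACT are hypothetical.

```fortran
      PROGRAM ACTIVE_QUERY_SKETCH
      INTEGER :: NALL, NACT
! One-dimensional arrangement covering every available processor.
!HPF$ PROCESSORS P(NUMBER_OF_PROCESSORS())
! Execution begins with all processors active, so NALL should
! equal NUMBER_OF_PROCESSORS().
      NALL = ACTIVE_NUM_PROCS()
! Assumption: at least two processors, so that P(1:2) is valid.
!HPF$ ON HOME( P(1:2) ) BEGIN
! Only P(1) and P(2) are active here, so NACT should be 2.
      NACT = ACTIVE_NUM_PROCS()
!HPF$ END ON
      PRINT *, NALL, NACT
      END PROGRAM ACTIVE_QUERY_SKETCH
```

A library routine could use a query of this kind to size its work to the processors that are actually executing it, rather than to the universal processor set.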

The data mapped to a processor is said to be resident on it. A replicated object is resident on all of the processors that have a copy of it.



