next up previous
Next: 4. Analysis of Parallelism Up: Analysis of Parallelism in Previous: 2.2. Recursive Functions for

3. Polytypic Parallel Skeletons

We use polytypic parallel skeletons as our parallel model. It provides us a set of parallel structures to capture data parallelism of sequential programs.

Our model is composed of map, reduce, upward accumulation, downward accumulation and zip skeletons. The skeletons are polytypic[14] in that they are generically defined for general algebraic data types.

Our model does not force that argument functions for reduce and accumulation to be associative as in [14]. With this freedom, we can fully extract data parallelism from the divide-and-conquer nature of computations on tree-like structures. The associativity of functions can be fully utilized for the efficient implementation of parallel skeletons.

Given a recursive data structure as is defined in the previous section, parallel skeleton functions are generically defined as follows :

Because the implementation of parallel skeletons is not within the scope of our research, we only summarize well-known results. The parallel time complexity of $map~\bar{f}$ is given by $O(max(T(f_i))_{i=1}^m)$ with enough processors where $T(f_i)$ denotes the computing time for $f_i$. Reduce and accumulation are not so simple to implement, and many researchers applied the tree contraction algorithm for efficient implementation. The naive implementations of $reduce~\bar{f}$, $scan_{up}~\bar{f}$ and $scan_{down}~\bar{f}$ have the parallel time complexity of $O(max(T(f_i))_{i=1}^m \times ht)$ with enough processors where $ht$ is the height of the tree-like structure. The tree contraction algorithm can be applied if each argument function $f_i$ for each node type can be replaced with a set of functions on the binary tree representation with no additional time complexity where partial compositions of the new functions require constant time and space. Using the tree contraction algorithm, $reduce~\bar{f}$, $scan_{up}~\bar{f}$ and $scan_{down}~\bar{f}$ can be computed in $O(log~n \times max(T(f_i))_{i=1}^m)$ parallel time[22].


next up previous
Next: 4. Analysis of Parallelism Up: Analysis of Parallelism in Previous: 2.2. Recursive Functions for
Joonseon Ahn
1999-09-04