In functional languages, data parallelism is expressed with the following two components.
For example, in NESL[5], parallel data collections are vectors and parallel skeletons are apply-to-each and a set of built-in parallel functions such as scan and permute. In BMF Model[21], parallel data collections are lists and parallel skeletons are map and reduce.
This paradigm enables efficient and portable parallel programming by encouraging programmers to build the parallel programs from ready-made skeletons for which efficient implementations are known to exist. Taking the form of higher-order functions, the parallel skeletons can be very familiar building blocks to functional language programmers.
However, data parallel programming is still not so easy. In functional languages, data collections such as lists or trees are declared recursively and applications on such data structures are naturally understood and specified using recursion. Extracting predefined forms of data parallelism from such applications and integrating them into efficient parallel programs are not simple.
For example, the following is the sequential program which numbers every node of a tree in in-order. Because the size of the left subtree is required to number the right subtree, inorder function returns not only the numbered tree but also the size of the input tree. Because the same function inorder is applied to each node of a tree using recursion, there can be valuable data parallelism in this program. But, because of dependencies among recursions, it is not straightforward to develop an efficient data parallel program for in-order numbering.
datatype tree = Leaf of int | Node of int * tree * tree
inorder (Leaf x, n) = (Leaf n,1)
| (Node(x,lt,rt), n) =
let
(t1,s1) = inorder(lt,n);
(t2,s2) = inorder(rt,1+n+s1)
in
(Node(n+s1,t1,t2),s1+s2+1)
end
In this paper, we propose a method to extract data parallelism from recursive functions. Our research has the following aspects.
Our research is presented as follows. In Section 2, we formally define the form of recursive functions to be parallelized. In Section 3, parallel skeletons which are used in object parallel programs are defined. In Section 4, we present an analysis method to classify subexpressions of a function body based on their conditions of parallel execution. In section 5, the form of object parallel program is presented and the parallelization process is sketched with an example. Finally we discuss related works and conclude in Section 6.