next up previous
Next: 5. Conclusion Up: The mZ Calculus and Previous: 3. A Strict Computation

Subsections


4. Implementation


The computation model of µZ implies the design of an abstract machine, called ZAM, whose implementation in C++ is currently in progress. The ZAM is a concurrent constraint resolution machine, which has some similarities with the machine described for Oz in [11]. The experimental state of the design of the ZAM is described below.


Values

Figure 3: Value Types of the ZAM
\begin{figure}
\vspace*{4pt}
\begin{verbatim}typedef ValueData const * Value; ty...
...a { virtual Thread spawn(..., int shift); }\end{verbatim}\vskip-5pt
\end{figure}

Fig. 3 shows the basic types used by the ZAM to represent values. There are four variants of values: constructed terms, variables (represented by an index), sets, and native values, such as numbers.

A set value, SetData(extens,intens), consists of two sorted sequences, one containing the extensional part and one containing the intensional part. Let vj be the values in extens, and ik be the intensions in intens, then a set value of the ZAM represents the µZ value $\Mzsingle{v_1}\mzcup\ldots\mzcup\Mzsingle{v_n}\mzcup i_1 \mzcup
\ldots \mzcup i_m$. This representation is well suited for implementing union and intersection of sets. For union, we just merge in linear time the extension and intension sequences of both sets. For intersection of two sets, SetData(extens1,intens1) and SetData(extens2,intens2), we build the set (in a pseudo notation)

   SetData(std::set_intersection(extens1,extens2), 
           {make_inten(extens1,intens2), 
            make_inten(extens2,intens1),
            make_inten(intens1,intens2)})
Here, make_inten(a,b) constructs a new intension which conceptually executes $\Mzinten{x}{x \mzin a \cand x \mzin b}$. Optimizations are possible if extensions and/or intensions of intersected sets are empty or singletons.

An intension (Fig. 3), describes the number of variables it allocates for its execution, its target value (representing the pattern in an µZ intension $\Mzinten{p}{\mzctr}$, a pointer to an array of values representing the environment the intension was constructed within, and an array of constraints. A constraint is specified by an abstract class, which just provides a method to spawn a thread executing this constraint (threads are explained later on). This abstraction allows the easy integration of specialized constraint implementations (for example, the intensions dynamically created with make_inten use constraints which have a non-standard way to spawn a thread).

When an intension is instantiated in the context of a goal, a block of variables is allocated in the variable table of the goal, and for each constraint a thread is spawn. The shift parameter of the spawn method defines the base index in the variable table for this instance of the intension. A thread obtains the actual index of a variable by adding to shift a constant offset.

The environment, env, of an intension contains bindings for variables which are free in the intension's constraints, and which are accessed by indexing from a constraint's thread. µZ's computation model demands that before an intension can be constructed, all such context variables are bound (cf. expression reduction rule % latex2html id marker 1520
\ensuremath{E\ref{rule:ESch}}), allowing the environment to be initialized at creation time of an intension. The situation is slightly different for fixed-point construction (cf. rule % latex2html id marker 1522
\ensuremath{E\ref{rule:EFix}}): during creation of an intension, the environment may contain holes, which are fixed after the intension has been created.


Threads and Goals

A thread (Fig. 4) executes the instructions from the instruction stream pointed to by the program counter pc. A goal is a collection of threads which together work on a set of variables, described by the field varTable, and a stack of choice points, described by the field choiceTable.

Figure 4: Control Types of the ZAM
\begin{figure}
\vspace*{4pt}
\begin{verbatim}typedef ThreadData * Thread; typede...
...rTable;
vector<ChoiceInfo> choiceTable; };\end{verbatim}\vskip-5pt
\end{figure}

A thread may be in one of several states, where the status MWAIT correspond to the rules % latex2html id marker 1524
\ensuremath{E\ref{rule:EMuE}}- % latex2html id marker 1526
\ensuremath{E\ref{rule:EMuF}} for executing the $\mzmu$-operator, and the status TWAIT to rules % latex2html id marker 1528
\ensuremath{C\ref{rule:CTstS}}- % latex2html id marker 1530
\ensuremath{C\ref{rule:CTstF}} for executing the test for emptiness/non-emptiness of sets. In theses states, a thread executes a subgoal, and is suspended until the subgoal has finished. Subgoal execution will not be detailed in this paper.

In status NORMAL, the thread is keen on executing its next instruction; in status WAIT it waits for the binding of a variable. A simple round-robin scheduler with exactly two priority levels selects the next thread to be executed in status NORMAL. The priorities are used for realizing the Andorra principal ([19]) which is central for the efficiency of the ZAM: if a thread needs to create a choice point, it lowers its priority, giving any threads which can deterministically continue under the current variable binding preference. Thus choice point creation is dynamically postponed as long as possible.

Assuming choice point scheduling, a rather simple model for dumping can be used, which does not requires to track data dependencies. Whenever a thread executes an instruction and has not yet dumped its state to the current most inner choice, it does so. The field dumpDepth contains the maximal choice depth a thread has been dumped to, and monotonically increases during thread's progress. The test for the need to dump is simply coded as goal->choiceTable.size() > thread->dumpDepth. This strategy would be hopeless inefficient if choice point scheduling would not be performed, since then threads would be backtracked arbitrarily, even if they do not depend on certain choice points. With choice point scheduling, however, independent computations are performed automatically before the choice point is created.

A choice point itself describes the thread which initiated the choice, some data which describes how to select the next alternative on backtrack, and trailing information. Trailing takes place for any actions performed during the time this choice is the innermost one: the field bound describes the set of variables which have been bound, field dumps saves the program counters of threads which made a step, and field enriched contains those threads which have been dynamically added (dynamic addition of threads appears when membership in an intension is executed; cf. resolution rule % latex2html id marker 1532
\ensuremath{C\ref{rule:CMemI}}). On backtrack, the variable bindings are released, the dumped thread's pc is restored, and the enriched threads are killed.

It is sufficient to just restore the program counter to backtrack threads since the ZAM uses a ``single-assignment'' register model for storing temporaries (field ThreadData::registers): an assignment to a register is never overwritten. Thus a thread can be restored to an arbitrary execution point, always finding a valid temporary state at this point3.


Instruction Set

Figure 5: Instruction Set of the ZAM
\begin{figure}
\vskip-6pt
\vspace*{-1ex}\begin{displaymath}\begin{array}{lllllll...
...mathsf{SUCCESS}
\end{array}\end{array}\end{displaymath}\vskip-10pt
\end{figure}

The ZAM uses a register transfer instruction language. The instructions are summarized in Fig. 5. The operands of instructions are the followings:

The following remarks on the instructions themselves:


Memory Management

A Boehm-style conservative garbage collector [1] is employed for memory retrieval. The collector needs to handle inferior pointers (references ``into'' a memory cell), because pointers to register regions may be stored in values, but also since the standard template library of C++ is employed, which may use local allocation strategies for containers. Nevertheless, the implementation of the garbage collector is compact (300 lines of code) and portable: this becomes possible since garbage collection can be invoked synchronously in-between the execution steps of threads, where no stack exists, and the only root to consider is the machine configuration which points to the topmost goal.


Performance

Since the compiler from µZ to ZAM code is not yet ready, performance measurements can be only very rudimentary. As a first guess, an ``assembler'' implementation of the list concatenation function app has been used:

\begin{displaymath}\begin{array}{lllllllllll}\Mzfix{app}{
\<\Mzschema{(\langle\...
...tminus t}{(xs,ys,t) \mzin app \mzcap
zs = x::t}} \>\end{array}\end{displaymath}

All solutions to app(xs,ys,c), where c is a constant list of length 128, have been queried in a loop executed thousand times (such that garbage collection needs to be invoked). The current implementation requires 16.4 seconds on a Pentium II/400, hence 16.4ms for one evaluation of app, backtracking over all possible 128 partitions. Around 2.9 million instruction steps are executed, which amounts to 177000 steps per second.

As a further test, a recursive definition of the fac function has been coded:

\begin{displaymath}\begin{array}{lllllllllll}\Mzfix{fac}{
\<\Mzschema{(x,y)}{x ...
...ansN{\Mzsingle{x-1}}{x}{(x,\_)})}
{(\_,y)}{y})}} \>\end{array}\end{displaymath}

For numbers and arithmetics, a native implementation is used (the ZAM has also some instructions for natives). Executing fac(8,y) thousand times is done in 1.6 seconds, requiring 303000 steps: thus one recursive function application as implemented with the $\mzmu$-operator requires around 0.2ms. In contrast, a highly efficient implementation of a functional language such as that of Opal [14] requires around 0.0014ms per recursive call - 142 times faster. This is not suprising, since the code the Opal compiler generates for fac is identical regarding efficiency to that of an according recursive C function - whereas function application via the $\mzmu$-operator deals with relations. Nevertheless, more effort has to be invested to get function application more efficient for special cases.

Taking into account that the ZAM is a virtual byte-code machine, these results are encouraging, though real benchmarking is required to get a clearer picture, and comparisons with the performance of other implementations of functional logic languages have to be made.



next up previous
Next: 5. Conclusion Up: The mZ Calculus and Previous: 3. A Strict Computation
Wolfgang Grieskamp