The computation model of µZ implies the design of an abstract machine, called ZAM, whose implementation in C++ is currently in progress. The ZAM is a concurrent constraint resolution machine, which has some similarities with the machine described for Oz in [11]. The experimental state of the design of the ZAM is described below.
Fig. 3 shows the basic types used by the ZAM to represent values. There are four variants of values: constructed terms, variables (represented by an index), sets, and native values, such as numbers.
A set value, SetData(extens,intens), consists of two sorted
sequences, one containing the extensional part and one
containing the intensional part. Let vj be the values in
extens, and ik be the intensions in intens, then
a set value of the ZAM represents the µZ value
.
This representation is well suited for
implementing union and intersection of sets. For union, we just merge
in linear time the extension and intension sequences of both sets. For
intersection of two sets, SetData(extens1,intens1) and
SetData(extens2,intens2), we build the set (in a pseudo
notation)
SetData(std::set_intersection(extens1,extens2),
{make_inten(extens1,intens2),
make_inten(extens2,intens1),
make_inten(intens1,intens2)})
Here, make_inten(a,b) constructs a new intension
which conceptually executes
An intension (Fig. 3), describes the number of
variables it allocates for its execution, its target value
(representing the pattern in an µZ intension
,
a
pointer to an array of values representing the environment the
intension was constructed within, and an array of constraints. A
constraint is specified by an abstract class, which just provides a
method to spawn a thread executing this constraint
(threads are explained later on). This abstraction allows the easy
integration of specialized constraint implementations (for example,
the intensions dynamically created with make_inten use
constraints which have a non-standard way to spawn a thread).
When an intension is instantiated in the context of a goal, a block of
variables is allocated in the variable table of the goal, and for each
constraint a thread is spawn. The shift parameter of the
spawn method defines the base index in the variable table for this
instance of the intension. A thread obtains the actual index of a
variable by adding to shift a constant offset.
The environment, env, of an intension contains bindings for
variables which are free in the intension's constraints, and which are
accessed by indexing from a constraint's thread. µZ's computation
model demands that before an intension can be constructed, all such
context variables are bound (cf. expression reduction rule
), allowing the environment to be initialized at creation
time of an intension. The situation is slightly different for
fixed-point construction (cf. rule
): during creation of an
intension, the environment may contain holes, which are fixed
after the intension has been created.
A thread (Fig. 4) executes the instructions from the
instruction stream pointed to by the program counter pc. A goal
is a collection of threads which together work on a set of variables,
described by the field varTable, and a stack of choice points,
described by the field choiceTable.
A thread may be in one of several states, where the status
MWAIT correspond to the rules
-
for
executing the
-operator, and the status TWAIT to
rules
-
for executing the test for
emptiness/non-emptiness of sets. In theses states, a thread executes a
subgoal, and is suspended until the subgoal has
finished. Subgoal execution will not be detailed in this paper.
In status NORMAL, the thread is keen on executing its
next instruction; in status WAIT it waits for the binding
of a variable. A simple round-robin scheduler with exactly
two priority levels selects the next thread to be executed
in status NORMAL. The priorities are used for realizing
the Andorra principal ([19]) which is central for
the efficiency of the ZAM: if a thread needs to create
a choice point, it lowers its priority, giving any threads
which can deterministically continue under the current
variable binding preference. Thus choice point creation is
dynamically postponed as long as possible.
Assuming choice point scheduling, a rather simple model for
dumping can be used, which does not requires to track data
dependencies. Whenever a thread executes an instruction and has not
yet dumped its state to the current most inner choice, it does so.
The field dumpDepth contains the maximal choice depth a thread
has been dumped to, and monotonically increases during thread's progress.
The test for the need to dump is simply coded as
goal->choiceTable.size() > thread->dumpDepth. This strategy
would be hopeless inefficient if choice point scheduling would not be
performed, since then threads would be backtracked arbitrarily, even if
they do not depend on certain choice points. With choice point
scheduling, however, independent computations are
performed automatically before the choice point is created.
A choice point itself describes the thread which initiated the choice,
some data which describes how to select the next alternative on
backtrack, and trailing information. Trailing takes place for any
actions performed during the time this choice is the innermost one:
the field bound describes the set of variables which have been
bound, field dumps saves the program counters of threads which
made a step, and field enriched contains those threads which
have been dynamically added (dynamic addition of threads appears when
membership in an intension is executed; cf. resolution rule
). On backtrack, the variable bindings are
released, the dumped thread's pc is restored, and the enriched threads
are killed.
It is sufficient to just restore the program counter to backtrack
threads since the ZAM uses a ``single-assignment'' register model
for storing temporaries (field ThreadData::registers): an
assignment to a register is never overwritten. Thus a thread can be
restored to an arbitrary execution point, always finding a valid
temporary state at this point3.
The ZAM uses a register transfer instruction language.
The instructions are summarized in Fig. 5.
The operands of instructions are the followings:
ThreadData::registers of the executing thread.
GoalData::varTable field of the goal the executing thread
belongs to. If the variable is bound, its binding is denoted,
otherwise a VarData term holding the variable's index. The
executing thread adds the ThreadData::shift value to the
index, which determines the base of the variable frame of on
associated instantiation. If a variable is the destination of an
instruction, then the value of the instruction's result is
unified with the addressed variable's value.
ThreadData::env of the executing thread.
The following remarks on the instructions themselves:
IntenData::env,
just a pointer to the register array is stored. The IntenData::env; however, by the assertion of ``single
assignments'' to registers, the
A Boehm-style conservative garbage collector [1] is employed for memory retrieval. The collector needs to handle inferior pointers (references ``into'' a memory cell), because pointers to register regions may be stored in values, but also since the standard template library of C++ is employed, which may use local allocation strategies for containers. Nevertheless, the implementation of the garbage collector is compact (300 lines of code) and portable: this becomes possible since garbage collection can be invoked synchronously in-between the execution steps of threads, where no stack exists, and the only root to consider is the machine configuration which points to the topmost goal.
Since the compiler from µZ to ZAM code is not yet ready,
performance measurements can be only very rudimentary. As a first
guess, an ``assembler'' implementation of the list concatenation
function app has been used:
As a further test, a recursive definition of the fac function
has been coded:
Taking into account that the ZAM is a virtual byte-code machine, these results are encouraging, though real benchmarking is required to get a clearer picture, and comparisons with the performance of other implementations of functional logic languages have to be made.