Lost in a Forest
Verification ·This is still a work in progress.
The title might be misleading. I’m not lost, nor in a forest 🙂. In this post I’ll share some work that I’ve been doing on specifying iteration.
Cursors
Iteration, be it the traversal of some data structure or the result
enumeration of some algorithm, can be conceived via small-step cursors.
This concept is well-known in popular high-level languages such as Java or
C++, the “for each” construct for (E x: ...)
is nothing short of
syntatic sugar for a cursor that might1 reach exhaustion via
controlled small-steps. Cursors are typically defined via 3 functions,
one of creation (create
), one that tests termination (has_next
), and one that
produces the next element in the iteration (next
).
c = create collection
while has_next c do
x <- next c
...
Reasoning about Iteration
Reasoning about iteration can be done using two predicates, permitted
that defines
the produced values and their well-formed sequence, and complete
that states
an iterator’s eventual completion. For instance, regarding the iteration over a
tree can be expressed via the permitted sequence of produces elements that is a prefix
of the sequence that order that the nodes will be visited. The order function is an oracle.
and similarly, the completeness of an iteration can be tested by comparing the length of the visited sequence and the number of nodes in the tree.
\[\texttt{completed}(v, \mathit{tree}) \triangleq ||v|| = \texttt{count}(\mathit{tree})\]Forests
Forests are n-ary trees, each tree
node can have as many child nodes as it may want. In OCaml, we can define
forests as a mutually recursive Algebraic Data-Type (ADT).
type 'a tree =
| E
| Node of 'a * 'a forest
and 'a forest = ('a tree) list
A tree
node is either E
mpty or some Node
that is a (value, list of children) pair.
Reasoning about mutually recursive ADTs is not trivial. Take for example a function that iterates
over a tree
and applies a function to each of its elements.
let rec iter_tree f tree =
match tree with
| E -> () (* do nothing *)
| Node (a, forest) ->
let _ = f a in (* apply function *)
iter_forest f forest (* visit its children *)
and iter_forest f forest =
match forest with
| [] -> () (* do nothing *)
| x :: t ->
iter_tree f x; (* tree recurse on current child node *)
iter_forest f t (* visit the rest of the list *)
It might not be so clear what our specification contracts should be, or better
yet what should be decorated with them. In other words, what should be the
contracts of iter_tree
and iter_forest
and how would they depend on
each other? We can take a step back and see the iteration as a whole, and not
as two independent recursive functions.
In that regard, let’s build a Cursor, in WhyML,
over our tree that is able to enumerate its
elements and allow us to better reason about its iteration. Firstly defining a
cursor, this can be expressed as a record with 3 fields, the sequence of visited
elements, the enumerated
elements and its initial collection
:
type cursor 'a = private {
ghost mutable visited : seq 'a;
ghost enumerated : seq 'a;
collection : tree 'a;
}
Noting that we end up doing a pre-order visit on the tree, we can define a function that enumerates the whole elements in that same order.
let rec function elements_tree (t: tree 'a) (acc: seq 'a) : seq 'a =
variant { t } (* t is strictly decreasing at each recursive call *)
match t with
| E -> acc
| Node e r -> elements_forest r (cons e acc)
end
with elements_forest (f: forest 'a) (acc: seq 'a) : seq 'a =
variant { f } (* f is strictly decreasing at each recursive call *)
match f with
| Nil -> acc
| Cons x r ->
let children = (elements_tree x acc) ++ acc in
elements_forest r children
end
We can use this function to populate the enumerate
sequence when we create the cursor:
val create (c: tree 'a) : cursor 'a
ensures { result.collection = c }
ensures { result.visited = empty}
ensures { result.enumerated = elements_tree c empty }
predicate permitted (c: cursor 'a) = forall e. mem e c.visited -> mem e c.elements
predicate complete (c: cursor 'a) = length c.elements = length c.visited
-
We can define iteration over infinite streams, in which case,
has_next
would always yieldtrue
. ↩