William E. Byrd, University of Utah, USA
Michael Ballantyne, University of Utah, USA
Gregory Rosenblatt, Toronto, Ontario, Canada
Matthew Might, University of Utah, USA
ICFP 2017
We present seven programming challenges in Racket, and an elegant, unified approach to solving them using constraint logic programming in miniKanren.
This pearl shows how a single unusual technique can solve a variety of programming problems. Each problem is presented as a challenge to the reader, and we invite you to develop your own idea of how each problem might be solved before reading ours. If you'd like to consider the challenges with a frame of mind uncolored by our approach, feel free to jump ahead to Section 1. If you'd like a sneak peek, read on.
We solve each challenge by using an interpreter for a subset of Racket, or for a slight variant of Racket. These interpreters are written as relations in the constraint logic programming language miniKanren. Each interpreter relates two arguments: an expression
expr to be evaluated, and the value
val to which expr evaluates.
These arguments may be only partially complete, containing logic variables representing unknown parts. Our miniKanren implementation will use constraint solving and search to fill in the logic variables with expressions and values consistent with the semantics of Racket.
By placing logic variables representing unknowns in different positions within the expr and val arguments, we can express a wide range of interesting queries, which provides the flexibility needed to solve the variety of challenges we pose.
We also show that this technique works especially well for reasoning about programs written in a functional programming language such as Racket.
The problem in section 2 is taken directly from Byrd et al. (2012), and the problem from section 1 is a simple variant of an example in that work. We include these problems to provide background and understanding for the problems in sections 3 through 7, which are all novel.
In his Valentine's Day blog post, "99 ways to say
'(I love you) in Racket," one of the authors [Might 2013] writes:
In spite of their simplicity, lists often confound new Racket programmers. After lists, the many forms for expressing a computation in Racket take time to discover and then master. To address both of these points, I've created 99 roughly tweet-sized expressions in Racket that all evaluate to the list
'(I love you).
Racket prints the list value
(I love you) as the quotation expression
'(I love you), and Might follows this convention. In this paper we show just the value. To replicate our results in Racket, evaluate
(print-as-expression #f) before the examples.
These examples, such as the three below, introduce students to an assortment of Racket's features.
Inspired by a love of Racket, and a love of the list
(I love you), we want to go even further.
Come up with 99,000 Racket expressions that evaluate to the list
(I love you), demonstrating a variety of expression types in the spirit of the blog post.
Stop and think: How would you solve Challenge 1?
We could follow Might's approach, and write the 99,000 Racket expressions by hand.
However, even we don't love the list
(I love you) that much.
There must be a better way...
The semantics of a programming language can be defined by an evaluation relation, which relates programs to their behaviors.
For example, an evaluation relation for Racket relates the expression
(let ((x 5)) (+ x x)) with the value 10.
An interpreter for Racket expressions is an implementation of the evaluation relation for a special case: the expression to be evaluated is known, and the value is to be derived.
What if instead of writing a normal interpreter, we implemented the evaluation relation in a constraint logic programming language, designed to query relations with any arguments unknown? We could solve our problem by simply leaving the expression argument as an unknown and specifying the value argument to be the list
(I love you).
To understand this approach we must first learn a little about how to write relations in miniKanren,
an embedded domain specific language for constraint logic programming [Byrd and Friedman 2006; Friedman et al. 2005].
Let's take list concatenation as our example. Here's how we might define
append in Racket:
We name the results of intermediate expressions to make our next transformation clearer. The append function can now be translated into the three-argument appendo relation in miniKanren:
We'll explain the miniKanren language forms used in this definition, but first let's take a look at what
appendo can do.
The append function can be used to concatenate two lists:
The appendo relation can also be used the same way. In order to use the relation, we construct a query using the
run* form. A query consists of query variables and calls to relations that will constrain those variables. Using search and constraint solving, miniKanren attempts to find values for the query variables that satisfy the constraints imposed by the relations. We emulate the behavior of
append by calling
appendo with concrete lists for the first two arguments, and the query variable
q representing an unknown third argument:
More excitingly, we can use
appendo to infer the list
q that, when prepended to the list
(d e), produces
(a b c d e):
These examples treating list concatenation as a relation will be familiar to any Prolog programmer.
A query needn't produce only a single answer. We can also infer all pairs of lists
x and y that, when appended together, produce
(a b c d e)
Now that we've seen the added expressivity we get from writing
appendo as a relation, let's consider its implementation. The definition of
appendo uses the three core logical operators from miniKanren:
==, fresh, and conde. These operators construct goals representing logical assertions, which may succeed or fail.
(== t1 t2) is a goal which asserts that
t1 and t2 have the same value.
(== 5 5) succeeds, while
(== 5 6) fails.
== is implemented using first-order unification, essentially a bidirectional form of pattern matching.
Unification operates on terms.
A miniKanren term is either the empty list, a Boolean constant, a number, a symbol, a logic variable, or a pair of terms.
Logic variables start out fresh—that is, they initially have no value—and may later obtain a value through calls to ==.
For example, assuming
x is a fresh logic variable,
(== 5 x) succeeds, and associates
x with the value 5.
A subsequent call
(== 6 x) would fail, since
x would already be associated with 5.
As mentioned, we can apply
== to terms that are pairs—for example, if
x and y are fresh variables, the goal
(== (cons 3 4) (cons x y)) would succeed, and associate x with 3 and y with 4.
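To make the behavior of == concrete, here is a minimal sketch of first-order unification in Python (our own illustration, not miniKanren's implementation), with atoms, logic variables, and pairs encoded as Python 2-tuples:

```python
# A minimal sketch of first-order unification (not miniKanren's actual
# implementation). A substitution is a dict mapping logic variables to terms.

class Var:
    """A logic variable; fresh until the substitution binds it."""
    def __init__(self, name):
        self.name = name

def walk(t, subst):
    # Follow variable bindings in the substitution.
    while isinstance(t, Var) and t in subst:
        t = subst[t]
    return t

def unify(t1, t2, subst):
    """Return an extended substitution, or None on failure."""
    t1, t2 = walk(t1, subst), walk(t2, subst)
    if isinstance(t1, Var):
        return subst if t1 is t2 else {**subst, t1: t2}
    if isinstance(t2, Var):
        return {**subst, t2: t1}
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2) == 2:
        s = unify(t1[0], t2[0], subst)
        return None if s is None else unify(t1[1], t2[1], s)
    return subst if t1 == t2 else None

x, y = Var('x'), Var('y')
s = unify((3, 4), (x, y), {})        # like (== (cons 3 4) (cons x y))
assert s == {x: 3, y: 4}
assert unify(5, 6, {}) is None       # (== 5 6) fails
s2 = unify(5, x, {})                 # (== 5 x) associates x with 5
assert unify(6, x, s2) is None       # a later (== 6 x) fails
```

The final two assertions mirror the fresh-variable example above: once x is associated with 5, a later attempt to associate it with 6 fails.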
For succinctness we often write pairs and lists using Racket's quasiquote and unquote forms. The single characters ` and , are shorthand for these forms, respectively.
All of these expressions are equivalent:
If the variable
x were bound to
5, each expression above would evaluate to
(3 (5 . y) 4).
(fresh (x* ...) g g* ...) introduces lexically scoped, fresh logic variables x* ....
fresh also performs logical conjunction (`and') over the goals in its body,
g g* ..., forming a new goal asserting that conjunction.
For example, (fresh (y z) (== 5 y) (== 6 z) (== y z)) first introduces the logic variables y and z, then performs a conjunction of the three calls to ==. (== 5 y) and (== 6 z) succeed, but (== y z) fails (since y and z are associated with 5 and 6, which differ), causing the entire fresh expression to fail.
(conde (g0 g0* ...) (g1 g1* ...) ...) constructs a goal that performs logical disjunction (`or') over its clauses.
Each clause acts as a conjunction over the goals it contains.
An example conde expression is
(conde ((== 3 x) (== 4 y)) ((== 5 x)))
with the two clauses
((== 3 x) (== 4 y)) and
((== 5 x)).
The first clause is a conjunction of
(== 3 x) and
(== 4 y).
In this case both clauses succeed: the first clause associates x with 3 and y with 4; the second clause produces a distinct answer that associates x with 5 but doesn't constrain y.
Since both clauses succeed, the entire
conde expression succeeds twice, and can produce two answers.
Looking at the definition of
appendo, we see its body is a
conde expression with two clauses. The first clause represents the base case, corresponding to the first clause of the
cond expression. The goal
(== '() l) corresponds to the
null? check, while
(== s ls) corresponds to returning the value of s, with ls representing the result of appending l and s. The second clause corresponds to the second clause of the
cond. Having named the intermediate results in
append, we can see how the unifications in this clause correspond to taking the pair
l apart, and constructing a return value using the result of a recursive invocation of
appendo. Because logical assertions can be made in any order, we are able to place the recursive invocation last in this clause. This ordering is preferable, since encountering a failing unification early can make it unnecessary to perform the recursion.
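To make these ideas concrete, here is a minimal interleaving micro-Kanren sketch in Python (in the spirit of Hemann and Friedman's microKanren, and only an approximation of the implementation used in this pearl), just enough to run the appendo queries from this section. Lists are encoded as nested pairs, so '(a b c) becomes ('a', ('b', ('c', ()))).

```python
# A toy micro-Kanren: goals map a substitution to a generator of substitutions.

class Var:
    def __init__(self, name):
        self.name = name

def walk(t, s):
    while isinstance(t, Var) and t in s:
        t = s[t]
    return t

def unify(u, v, s):
    u, v = walk(u, s), walk(v, s)
    if isinstance(u, Var):
        return s if u is v else {**s, u: v}
    if isinstance(v, Var):
        return {**s, v: u}
    if isinstance(u, tuple) and isinstance(v, tuple) and len(u) == len(v) == 2:
        s = unify(u[0], v[0], s)
        return None if s is None else unify(u[1], v[1], s)
    return s if u == v else None

def eq(u, v):                          # miniKanren's ==
    def goal(s):
        s2 = unify(u, v, s)
        if s2 is not None:
            yield s2
    return goal

def conj(*goals):                      # conjunction, as in fresh's body
    def goal(s):
        def step(gs, s):
            if not gs:
                yield s
            else:
                for s1 in gs[0](s):
                    yield from step(gs[1:], s1)
        yield from step(goals, s)
    return goal

def disj(*goals):                      # disjunction with interleaving, like conde
    def goal(s):
        streams = [g(s) for g in goals]
        while streams:
            st = streams.pop(0)
            try:
                yield next(st)
            except StopIteration:
                continue
            streams.append(st)         # round-robin among live streams
    return goal

def appendo(l, s, out):
    def goal(subst):
        a, d, res = Var('a'), Var('d'), Var('res')   # like fresh
        yield from disj(
            conj(eq((), l), eq(s, out)),                       # base case
            conj(eq((a, d), l), eq((a, res), out),
                 appendo(d, s, res)))(subst)                   # recursion last
    return goal

def lst(*xs):                          # build a proper list as nested pairs
    out = ()
    for x in reversed(xs):
        out = (x, out)
    return out

def reify(t, s):                       # substitute variable values into a term
    t = walk(t, s)
    if isinstance(t, tuple) and len(t) == 2:
        return (reify(t[0], s), reify(t[1], s))
    return t

def run(qvars, goal):                  # like run*: collect all answers
    return [tuple(reify(q, s) for q in qvars) for s in goal({})]

q = Var('q')
forward = run([q], appendo(lst('a', 'b', 'c'), lst('d', 'e'), q))
assert forward == [(lst('a', 'b', 'c', 'd', 'e'),)]

backward = run([q], appendo(q, lst('d', 'e'), lst('a', 'b', 'c', 'd', 'e')))
assert backward == [(lst('a', 'b', 'c'),)]

x, y = Var('x'), Var('y')
splits = run([x, y], appendo(x, y, lst('a', 'b', 'c', 'd', 'e')))
assert len(splits) == 6                # all six ways to split the list
```

The three queries correspond to the three uses of appendo shown above: running forwards, inferring an unknown prefix, and enumerating every pair of lists that append to (a b c d e).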
Let's also revisit our query:
run* introduces the query variable
q, then evaluates its body as a conjunction of goals, returning a list of all the answers it produces. Under the hood, miniKanren uses a search that traverses the tree of conjunctions and disjunctions constructed by these goals.
Each answer consists of a list of values unified with query variables, followed by any additional side conditions introduced by lazy constraints. In this case, there is only a single answer unifying a single query variable to a value that happens to be a list, resulting in a triply-nested list. Here the single answer found unifies
q with the list (a b c d e). If we wanted miniKanren to stop searching after producing the first answer rather than continuing until it can prove there are no more, we could use
run 1 rather than run*.
As we just saw, by treating list concatenation as a relation instead of a function we get additional, useful behavior. This is also true of more complicated programs, such as interpreters.
Imagine an interpreter for a subset of Racket, written as a miniKanren relation.
This interpreter—which we might call
evalo—would take two arguments: an expression
e to be evaluated, and the value
v of that expression.
A query with the expression fully known might return ((5)), indicating that the Racket expression ((lambda (x) x) 5) evaluates to the value 5.
The good news is that
evalo actually exists! We've written an interpreter for a small—but interesting and Turing-complete—subset of Racket, including first-class multi-argument and variadic functions,
letrec for recursive definitions, and a simple pattern matcher.
Other than the pattern matcher, the language supported by
evalo is also a subset of Scheme.
evalo is inspired by, but significantly extends, the relational interpreter given in Byrd et al. (2012).
The even better news is that we can use
evalo to generate expressions that evaluate to
(I love you)!
Stop and think:
Construct a miniKanren query that uses
evalo to solve Challenge 1.
Let's start off small, by generating just a single expression that evaluates to
(I love you):
Success! The expression
'(I love you) evaluates to the list
(I love you). This expression isn't very exciting, though. Let's try to find more interesting answers...
Emulating the blog post, we can replace
run 1 with run 99 to produce 99 expressions that evaluate to
(I love you):
Here is a slightly more complex answer:
The sym side-condition tells us that
_.0 in the expression
((lambda _.0 _.0) 'I 'love 'you) can be replaced by any legal Racket identifier (represented in our interpreter as a Racket symbol).
However, it's not actually necessary to replace
_.0 with another identifier, since
_.0 is itself a valid identifier in Racket:
This answer is especially interesting because
(lambda x x) is a variadic function—that is, a function that takes any number of arguments—which returns the arguments passed to it as a list.
((lambda x x) 'I 'love 'you) is equivalent to the expression
(list 'I 'love 'you).
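For readers more familiar with Python than Racket, a starred parameter gives the same collect-the-arguments behavior as the variadic (lambda x x); this analogy is ours, not part of the interpreter:

```python
# An analogy in Python: Racket's variadic (lambda x x) collects all of its
# arguments into a list, much like a lambda with a starred parameter.
collect = lambda *args: list(args)

# Like ((lambda x x) 'I 'love 'you), which evaluates to (I love you):
result = collect('I', 'love', 'you')
assert result == ['I', 'love', 'you']
```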
Here is one more interesting answer:
The side condition
(num _.0) indicates that
_.0 in the expression
(if _.0 '(I love you) _.1) represents an arbitrary numeric constant.
Replacing _.0 with a specific number, such as
42, yields the expression
(if 42 '(I love you) _.1).
Since 42 is considered a true value in Racket, the `else' arm of the
if expression is never evaluated.
Hence _.1 may be replaced by any legal Racket expression, and the overall expression will still evaluate to
(I love you).
There's something important happening in miniKanren's implementation to make these queries work. We can think of all the ways of constructing an expression as forming an infinite tree, branching at every point where a different kind of subexpression could be selected. The execution of the
evalo relational interpreter searches through this tree for expressions that evaluate to
(I love you).
Stop and think: Does it matter what search strategy miniKanren uses?
In fact, the search strategy matters a great deal. Searching an infinite tree requires care: for example, depth-first search could get stuck considering an infinite subtree that doesn't contain answers to our query. Even when depth-first search does find answers in an infinite tree, it might find only one kind of answer rather than the variety of answers we're looking for. Our example programs might then only use
car, but never
lambda. Thus it is important that miniKanren uses a complete search, which interleaves the processes of searching different parts of the tree [Kiselyov et al. 2005]. Our interleaving search also biases towards branches which have shown progress in the form of intermediate results, rewarding them with an increased share of the search effort. This bias allows miniKanren's search to investigate promising branches much more deeply than would unbiased complete searches like breadth-first and iterative deepening depth-first search.
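The danger of an unfair search can be illustrated with plain Python generators (a toy model of our own, not miniKanren's actual search). Imagine two infinite branches of the tree of candidate expressions, one producing car-based answers and one producing lambda-based answers:

```python
# A toy illustration of why completeness matters: depth-first search never
# leaves the first infinite branch, while interleaving samples both.
from itertools import islice

def car_branch():            # an infinite branch of one kind of answer
    n = 0
    while True:
        yield ('car', n)
        n += 1

def lambda_branch():         # another infinite branch
    n = 0
    while True:
        yield ('lambda', n)
        n += 1

def depth_first(g1, g2):     # exhaust g1 before g2: never reaches g2
    yield from g1()
    yield from g2()

def interleave(g1, g2):      # alternate between branches, as a fair search does
    a, b = g1(), g2()
    while True:
        yield next(a)
        yield next(b)

dfs_answers = list(islice(depth_first(car_branch, lambda_branch), 6))
fair_answers = list(islice(interleave(car_branch, lambda_branch), 6))

assert {kind for kind, _ in dfs_answers} == {'car'}          # one kind only
assert {kind for kind, _ in fair_answers} == {'car', 'lambda'}
```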
Now it's time to shift our query into overdrive:
Here are two artisanal, free-range
(I love you) expressions, hand-curated from the 99,000 answers:
Both answers include a
num side-condition, which we've replaced with the number
42 so that the expressions will run without error in Racket.
In his classic paper, `A micro-manual for LISP – Not the whole truth,' John McCarthy describes the rules for a LISP evaluation function, value, and then offers this challenge [McCarthy 1978]:
Difficult mathematical type exercise: Find a list e such that
value e = e.
Solve McCarthy's exercise by finding a list
e that evaluates to itself in Racket.
Stop and think: How would you solve Challenge 2?
We could try to construct such a list by hand, either through trial-and-error or through cleverness.
If we were of a more theoretical bent, we could use Kleene's second recursion theorem [Kleene 1952] to construct such a list.
We adopt a different, more direct approach, originally presented by Byrd et al. (2012).
We can once again use
evalo, the relational interpreter we describe in section 1.2.
Stop and think:
Write a miniKanren query that uses
evalo to find an expression
e that evaluates to itself.
It's actually very simple to write such a query—just use the query variable as both the first and second arguments to
evalo (representing the ``input'' expression and ``output'' value, respectively):
And, indeed, this query produces an expression that evaluates to itself:
This answer represents any expression
_.0 such that
_.0 is a number (represented by the
(num _.0) side-condition). That is, the answer represents any number.
Expressions that evaluate to themselves have been named quines by Hofstadter.
The single answer returned by our query represents infinitely many individual quines, since any numeric literal is trivially a quine in Racket—for example, the expression
42 evaluates to the value 42.
We've found a trivial expression that evaluates to itself.
However, the actual challenge is to find a list that evaluates to itself.
If we replace the
run 1 in our query with
run 3, we get back two more trivial quines:
#t and #f, the self-evaluating Boolean constants in Racket.
The fourth answer, produced with
run 4, is more interesting:
The first part of this answer is indeed a list that evaluates to itself:
Though McCarthy describes finding quines as a ``difficult mathematical type exercise,'' the quine-generating query is the simplest possible query involving
evalo, in that it contains the minimal number of distinct symbols.
The side-conditions in the second part of the answer tell us that
_.0 can be any legal Racket identifier other than closure, prim, or quote.
For example, we could replace
_.0 with any other identifier satisfying these side-conditions.
However, it would not be legal to replace
_.0 with the identifier
list, since this would shadow the
list function used in the body of the
prim as datatype tags, which is why these names cannot be used in expressions. This leaky abstraction could be mitigated by generating unique symbols for these tags. To truly plug up the leak, we would also have to tag all pairs, which would permit the unambiguous use of these tags as values.
We can produce our non-trivial quine using run 1 rather than run 4 if we unify e with the pair `(,a . ,d), forcing e to be a list:
We can also use
evalo to generate twines—two distinct expressions p and q such that p evaluates to q, and q evaluates to p.
Our query uses the disequality constraint operator
=/= to assert that
p and q must be different.
The result of the query indicates that
p is the expression
q is the expression
where, once again,
_.0 can be any legal Racket identifier, other than
closure, prim, or quote. Importantly,
_.0 must have the same value in both p and q.
``Lazy constraints''—such as the disequality constraint
=/= and the
numbero constraint used to construct the
num side-condition—are an important part of constraint logic programming [Jaffar and Lassez 1987] that we take advantage of here. These constraints are critical to the performance of our queries.
The abstract representation of a number allowed by the
numbero constraint prevents the quine-generating query from enumerating infinitely many trivial quines.
Disequality constraints provide an important form of negation.
Using unification alone, disequality between terms must be expressed by enumerating all possible combinations of values that differ, making twine generation hopelessly inefficient.
Stop and think:
Write a query that generates a
thrine---that is, three distinct expressions p, q, and r such that p evaluates to q, q evaluates to r, and r evaluates to p.
Here is the thrine-generating query, which requires a disequality constraint for each pairing of p, q, and r:
Chaining additional uses of
evalo, going from quines to twines to thrines, results in the query taking more time and memory.
The thrine-generating query may not find a thrine before running out of memory, unless some of the primitives and forms are removed from the
evalo relation, which reduces the branching factor of the search.
Challenge 3. Many students struggle with the concept of lexical scope versus dynamic scope. To help these students out, generate 100 expressions that evaluate to 42 under lexical scope, and 137 under dynamic scope. To help focus on issues of scope, use a restricted subset of Racket that corresponds to the call-by-value λ-calculus extended with numeric constants.
Stop and think: How would you solve Challenge 3?
We've seen that we can synthesize programs that evaluate to a particular value—such as
(I love you)—using a relational interpreter. Our solution to this challenge evaluates an unknown expression simultaneously in two relational interpreters: one implementing lexical scope and one implementing dynamic scope.
We need only a few language forms to reveal distinctions between the two scoping disciplines: lambda, variable reference, function application, and numeric literals for the return values specified in the challenge.
The evaluation judgement is implemented as the miniKanren relation
eval-expr-lexo. In our queries we'll use the relation
eval-lexo which just calls
eval-expr-lexo with the empty environment.
We point out the interesting features of the miniKanren code corresponding to each inference rule:
The numbero constraint ensures the clause only applies when expr is a number.
A lambda expression is evaluated only when lambda isn't bound in the environment. The recursive not-in-envo relation implements this restriction. Because miniKanren uses first-order unification, we must use data-structural representations of expressions, environments, and closures. Expressions are represented as s-expressions, environments as association lists, and closures as lists containing the symbol
'closure, the code for the procedure, and the environment where the lambda was evaluated.
The lookupo relation implements variable lookup in the environment (represented as an association list). The lazy symbolo constraint asserts that the expression is a symbol representing a lexical variable.
The ext-envo relation extends the closure's environment cenv with a binding of the parameter x to the argument value val, by unifying the extended environment with
`((,x . ,val) . ,cenv).
These semantics don't exactly correspond to a subset of Racket. We only care about the behavior of programs that terminate and don't produce errors. As such, we can cheat a little and get away with it! For example, our interpreter won't evaluate the remaining expressions in an application if the first expression, the operator, doesn't evaluate to a closure. In Racket,
(5 ((lambda (x) (x x)) (lambda (x) (x x)))) loops infinitely, but its evaluation in
eval-lexo terminates with failure. Even better, our optimization makes the evaluation of the application
(5 e), where
e is an unknown expression, terminate with failure as well. Without this optimization, miniKanren would have to generate, and evaluate, each of the infinitely many possible expressions e.
Our ``cheat'' to optimize application is valid because our interpreter implements a purely functional language. Any successful, terminating evaluation must eventually check that the value of the operator expression is a closure. What might happen if we applied this optimization to an interpreter for a language with exceptions? For a language with user-defined exceptions that can be thrown, but not caught, we might treat throwing an exception as failure. Our optimization would remain correct. However, we might instead write an interpreter for a language where exceptions can be caught, or where we want to report details of the exception rather than fail. Then the interpreter must apply the optimization only when it is guaranteed that evaluation will eventually check that the value of the operator expression is a closure. First-class control operators like
call/cc present a similar challenge.
Creating a variant of the interpreter for dynamic scope requires only a tiny change to the application rule: rather than extending
cenv, the environment from the closure, we extend
env, the environment at the call site. We'll refer to this variant as eval-dyno. Our query runs both interpreters on the same unknown expression expr, asserting that in each interpreter expr evaluates to the right number.
We can easily ask for many more programs; a
run 100 query finds 100 of them in 1.7 seconds on an early 2013 MacBook Pro.
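The difference between the two evaluators can be sketched in a few lines of Python (our own illustration, not the paper's miniKanren relations): the only change is which environment the closure body extends. The test expression below is our own example; it evaluates to 42 under lexical scope and 137 under dynamic scope.

```python
# A direct-style evaluator for numbers, variables, lambda, and application.
# The lexical/dynamic distinction is a single line: which environment the
# closure body extends.

def evaluate(expr, env, dynamic=False):
    if isinstance(expr, int):                      # numeric literal
        return expr
    if isinstance(expr, str):                      # variable reference
        return env[expr]
    if expr[0] == 'lambda':                        # ('lambda', param, body)
        _, param, body = expr
        return ('closure', param, body, env)       # capture env at creation
    rator, rand = expr                             # application: (rator, rand)
    clo = evaluate(rator, env, dynamic)
    arg = evaluate(rand, env, dynamic)
    _, param, body, cenv = clo
    base = env if dynamic else cenv                # the one-line difference
    return evaluate(body, {**base, param: arg}, dynamic)

# ((lambda (x) ((lambda (f) ((lambda (x) (f 0)) 137)) (lambda (y) x))) 42)
expr = (('lambda', 'x',
         (('lambda', 'f',
           (('lambda', 'x', ('f', 0)), 137)),
          ('lambda', 'y', 'x'))),
        42)

assert evaluate(expr, {}) == 42                    # lexical scope
assert evaluate(expr, {}, dynamic=True) == 137     # dynamic scope
```

Under lexical scope the reference to x inside (lambda (y) x) sees the binding x = 42 captured when the closure was created; under dynamic scope it sees the most recent binding x = 137 at the call site.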
Curiously, the query is very slow if the
eval-dyno call is reordered after the eval-lexo call.
It turns out that this reordered query spends most of its time evaluating non-terminating programs!
Relatively simple programs in the language of
eval-dyno can express non-terminating computations; some of these programs may terminate in
eval-lexo. For example, here's a program that evaluates to 42 under lexical scope but does not terminate under dynamic scope:
Stop and think: Why does the evaluation of this expression never terminate with dynamic scope?
After the outermost function application, the subexpression
is evaluated. The parameter of the first
lambda expression binds
x to the procedure resulting from the second
lambda expression. The body of that procedure,
(x 2), is evaluated while the binding is still present in the dynamic environment, making it a recursive call.
Program synthesis tools generally need to take special care to avoid non-terminating programs, and sometimes employ termination checking to rule them out. How is it that our relational interpreters don't get totally stuck when evaluating a non-terminating program?
Instead of exploring the alternative solutions represented by different
conde clauses in sequence, the miniKanren search interleaves the work. A non-terminating computation in one clause will consume computational effort but cannot entirely halt work on other solutions. Even with an interleaving search strategy, however, we can still get in trouble. If many non-terminating computations stack up, they can starve work on other candidate solutions for resources. In many circumstances our search is clever enough to avoid that possibility, but here it fails.
In the reordered query with
eval-lexo first, the search quickly finds many partial programs that evaluate to 42 in lexical scope. Each of these programs is subsequently evaluated under dynamic scope, and as in the example above, some will never terminate. The implementation of conjunction in miniKanren assigns half of the computational effort to evaluating the first lexical-scope solution under dynamic scope, a quarter to the second, an eighth to the third, and so on. That strategy works well when each evaluation quickly succeeds or fails. When those computations fail to terminate, however, later solutions are starved for time.
Why does the query work better with
eval-dyno first? Our
eval-lexo language doesn't provide a form like
letrec for direct recursion, so non-terminating programs require a more complex construction like the Z-combinator (the call-by-value Y-combinator). As a result, the programs that
eval-dyno finds before solving our query happen to terminate in eval-lexo.
We've seen the flexibility of the
appendo relation in section 1.1. Is there a way to get the same flexibility, but using the standard Racket definition of the append function?
Stop and think: How would you solve Challenge 4?
One solution to this problem would be a Racket-to-miniKanren compiler, or a macro with the same functionality.
Because evalo implements enough of Racket to run
append, we can solve this problem directly: we just run the Racket definition of
append inside a query to evalo.
To keep the implementation of
evalo simple, we define append using letrec instead of define, and if instead of cond.
Naturally we can run append:
Remember how we used
appendo as a relation in section 1.1:
Because run* returned a single answer, miniKanren has proven that no other answers exist: only
(a b c) can be prepended to
(d e) to produce
(a b c d e).
The appendo relation has three arguments, with the last one representing the result of appending the first two arguments. When evaluating a call to
append in the relational interpreter, the second argument to
evalo serves a similar purpose, representing the return value of the call to append.
This means we can use
append as a relation! We pass the logic variable
q as the first argument to append, using , to escape the ` form wrapping the expression:
It seems to work. Let's try the same query with
run* to see whether evaluating
append in the interpreter can replicate
appendo's ability to prove that only one answer exists.
Alas, we'll be waiting a long time indeed; the
run* query never returns a result. There are two possible causes: either there are infinitely many answers, or there are finitely many but miniKanren can't prove this. If there are many answers, a
run 2 query should terminate with two answers. If there is only the one answer, the
run 2 query should loop forever just like the run* query.
Stop and think:
Will the run 2 query return two answers, or will it loop forever?
The run 2 query actually terminates with these two answers:
What's that lambda doing there? While appendo operates on lists, the relational interpreter operates on expressions. The first argument to
append can be any of an infinite class of expressions that evaluate to the list
(a b c)! And one such expression is
((lambda _.0 '(a b c))).
The fact that we're synthesizing expressions rather than values also explains a subtle change you may have noticed in the first answer: the query finds the quotation expression
'(a b c) rather than the value
(a b c).
Stop and think:
How can we recover the behavior of appendo, where run* proves there is only one list
q that when prepended to
(d e) produces
(a b c d e)?
The smallest of changes will do: just add a quote before the unquoted logic variable
q. For each value, there is only one quotation expression that evaluates to that value.
We've found that running append in the relational interpreter has all the power of the
appendo relation implemented in miniKanren.
Even better, the relational behavior of
append and the interpreter's ability to generate expressions combine; we can now query for Racket expressions whose values satisfy a given call to
Let's go even further. Not only can we generate expressions in argument position of the call to
append, we can generate expressions for the implementation of
append. We can even run the (as yet partially unknown)
append function backwards. We can do all of these things in a single query:
Let's discuss how this program is represented as a miniKanren term so that we can explain what is happening.
A term can be either fully instantiated (containing no logic variables), partially instantiated (having some fixed structure, but containing one or more logic variables), or fully uninstantiated (a single logic variable with no associated value).
In this case we've left part of the definition of
append uninstantiated, representing the missing code with the logic variable x.
The program calls
append twice: once with the output fully instantiated but part of the input unknown, represented by the logic variable
y, and once with fully instantiated input and output.
As the relational interpreter evaluates the first call to
append, it synthesizes assignments of
x and y such that
(append y '(c d e)) evaluates to
(a b c d e).
The second call selects from those assignments a solution with an
x such that
(append '(f g h) '(i j)) evaluates to
(f g h i j).
The query instantiates
x to the correct code snippet
(car l), and
y to the correct missing argument, (a b).
This query simultaneously performs code synthesis and bidirectional evaluation.
At this point, it's natural to wonder how much of
append can be synthesized with this technique. With a straightforward, unoptimized implementation of
evalo, only small parts of
append can be synthesized in a reasonable amount of time. Can we do better? By optimizing a few key portions of
evalo, we can in fact synthesize the entire body of
append from just a few examples of its use! We'll briefly describe the optimizations that enable this degree of synthesis, but these details aren't necessary for an understanding of the rest of the paper.
The most significant optimization we implement orders goals carefully when performing procedure application. We evaluate the operator expression first to determine whether its value is a closure. If it's a closure, we next evaluate operand expressions that are at least partially instantiated, followed by the closure body, and then finally the fully uninstantiated operand expressions.
This order best takes advantage of known information. Evaluating partially instantiated operand expressions is assumed to be straightforward, and doing this first means that the values of corresponding variable references in the closure body do not need to be guessed blindly. Evaluating the closure's body before evaluating uninstantiated operand expressions reduces the number of branching computations between guessing and testing: guesses about the body can be immediately tested against the expected result of the application, whereas uninstantiated operand expressions cannot be immediately tested.
Consider this query:
Within the body of the
let we would first evaluate the
lambda expression to a closure, then
7, followed by the closure body
(cons a d) where
d is bound to 7 while
a is still unknown. Only if those succeed would we start guessing terms for
e. In this case the query fails before we make any guesses. Evaluating the closure body produces a pair whose cdr is 7, which cannot possibly fit our expected result of (5 . 6), whose cdr is 6.
Because we are implementing a purely functional language, we can perform these reorderings without changing a program's behavior. If evaluating
e could produce side effects, such as mutating the value of
x, we would have to be more careful.
Here are some other optimizations that help a great deal:
We defer evaluating a conde when it seems likely that more than one clause will succeed. Deferred goals are queued up to be retried later, after all informed guesses have been made, with the hope of having learned more useful information by the time we retry.
When guessing the test of a conditional, we consider only the boolean values #t and #f (as opposed to other ``true'' Racket values).
We reorder the clauses of evalo, favoring expression types that are more common.
Let's get back to synthesizing
append. We'll start by providing a couple of example calls.
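A query in this spirit (a hedged sketch; the particular examples and the query shape are our assumptions, not the paper's exact code) supplies a single logic variable for the entire body of append:

```racket
;; Sketch: defn is a logic variable standing for the whole definition.
;; The first answer found may simply hard-code the expected output.
(run 1 (defn)
  (evalo `(letrec ((append ,defn))
            (list (append '() '())
                  (append '(1) '(2))))
         '(() (1 2))))
```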
This definition doesn't look right. Although it satisfies our specific query, it isn't general enough to append arbitrary lists. What went wrong?
Because the search simply looks for any definition that works, an easy way to satisfy the constraints is to return our example output verbatim. This is why we see our expected second result embedded directly in the definition. How do we get around this?
One solution is to prohibit the definition from mentioning our example data directly. The constraint
(absento t1 t2) prevents term
t1 from appearing anywhere in term t2.
The miniKanren implementation we use for this pearl restricts
t1 to be a literal symbol or number rather than a term. Other implementations can handle the general version of the constraint.
We can use
absento to prevent numbers from our examples from appearing in the definition we're synthesizing. We provide an
absento constraint for each number we mention.
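A hedged sketch of such a query, with one absento constraint per example number:

```racket
;; Sketch: forbid the example constants 1 and 2 from appearing
;; anywhere in the synthesized definition.
(run 1 (defn)
  (absento 1 defn)
  (absento 2 defn)
  (evalo `(letrec ((append ,defn))
            (list (append '() '())
                  (append '(1) '(2))))
         '(() (1 2))))
```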
Though our example result is no longer embedded in the definition, the definition still is not general enough to append arbitrary lists. It handles exactly the cases of the empty and single element lists. The definition can't handle larger lists because it's not recursive!
Stop and think:
How can we synthesize a more general version of append?
Adding a couple more usage examples helps us find a more general definition.
This query responds with a fully general definition of
append, implemented as a recursive procedure.
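Modulo variable names and minor syntactic choices, a fully general definition has the familiar recursive shape (the synthesized answer may differ in its surface details):

```racket
;; The textbook recursive definition of append.
(define append
  (lambda (l s)
    (if (null? l)
        s
        (cons (car l) (append (cdr l) s)))))
```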
Why isn't the answer, once again, specific to exactly the examples provided?
Given enough examples, the simplicity of the recursive solution makes it easier to find than the overly-specific solution!
Though we can synthesize
append from scratch, what if a standard library were available? Would we be able to define
append in terms of existing definitions? Let's try providing fold-right. With fold-right available in the environment, append is quickly defined in terms of it!
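One such definition, assuming an SRFI-1-style fold-right, is this sketch:

```racket
;; fold-right walks l from the right, consing each element onto s,
;; which is exactly what append does.
(define append
  (lambda (l s)
    (fold-right cons s l)))
```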
While our optimized relational interpreter can both synthesize recursive functions and take advantage of higher-order functions, it struggles with larger programs. The search space grows exponentially with the size of the smallest correct definition of the program to be synthesized. Our current implementation can't synthesize complete recursive programs much larger than append.
Test-driven development asks us to write our test cases before we write our programs, so that we can receive immediate feedback on the code we've written.
Say we're writing a program
remove that removes all occurrences of a symbol
x that occur as top-level elements in a list ls.
Our first job is to describe
remove with a set of tests:
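A test suite in this spirit might include tests like the following (these particular tests are our own illustration, apart from (remove 'foo '(foo)), which reappears later in this section):

```racket
;; Illustrative tests only; the exact suite is not reproduced here.
(remove 'foo '())            ; => ()
(remove 'foo '(foo))         ; => ()
(remove 'foo '(bar))         ; => (bar)
(remove 'foo '(bar foo baz)) ; => (bar baz)
```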
Let's pretend that after thinking over the cases that our function must handle, we have written the program skeleton below. Here A, B, and C represent ``holes,'' which are expressions we've yet to fill in.
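One plausible shape for the skeleton (the hole names A, B, and C come from the text; the surrounding structure is our guess, and the unquoted holes make this a sketch rather than runnable Racket):

```racket
;; ,A ,B and ,C mark holes we've yet to fill in.
(define remove
  (lambda (x ls)
    (cond
      [(null? ls) '()]
      [(equal? (car ls) x) (cons ,A ,B)]
      [else ,C])))
```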
We've already made a mistake! Though the program isn't complete, there is no way to fill in the holes to form a definition of
remove that matches our tests.
Sadly, test-driven development in a typical programming environment doesn't help us: we can't run tests on an incomplete program.
Challenge 5. How can we determine that we've already made a mistake?
Stop and think: How would you solve Challenge 5?
Using our relational interpreter we actually can run the test suite with our incomplete program, by representing any holes with (unquoted) logic variables.
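A query of this shape (hedged; the skeleton and tests are our assumptions) replaces each hole with a fresh logic variable and runs all the tests at once:

```racket
;; Sketch: if no instantiation of the holes A, B, C satisfies every
;; test, the query returns the empty answer set.
(run 1 (A B C)
  (evalo `(letrec ((remove
                    (lambda (x ls)
                      (cond
                        [(null? ls) '()]
                        [(equal? (car ls) x) (cons ,A ,B)]
                        [else ,C]))))
            (list (remove 'foo '())
                  (remove 'foo '(foo))
                  (remove 'foo '(bar))))
         '(() () (bar))))
```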
Our query responds almost immediately with the empty answer set. Because miniKanren uses a complete search that is guaranteed to find an answer if one exists, the empty answer set means there is no correct way to synthesize expressions to fill in the holes. We've proved that an incomplete program is inconsistent with its test cases! We call such a proof a refutation of the program.
It's worth pausing for a moment to discuss how refutation works, and how it can be so fast. We mentioned that the underlying system runs a complete search, and it may be tempting to assume the search uses blind, brute force. But this search isn't blind. By working top-down, connecting expected result values to the outer expression shapes being considered, the search can prune unpromising branches very early on. It's only after the outer expression shape is shown to be compatible with the desired result that sub-expressions are considered.
Stop and think: We know our program is wrong. How do we figure out where it went wrong?
By attempting to satisfy tests individually, we can get a hint at the problem.
In general, all test subsets may need to be checked since a subset may fail even if all its tests are individually satisfiable.
In this case, the only unsatisfiable test is:
(remove 'foo '(foo)) => ()
This failing test suggests
remove does not work correctly when the first list element is the item to be removed. If we relax a portion of our program by forming more or larger expression holes, looking for the moment our test becomes satisfiable, we can narrow down the problem. In this case, we are able to satisfy our query by relaxing the right-hand-side expression of our equality clause.
The test now passes, suggesting it was a mistake to try
(cons ,A ,B) in the equality clause;
the clause should produce the empty list, not a pair. With this insight, we can continue implementing remove.
Let's try the related problem of removing a symbol from anywhere in an s-expression, even inside deeply nested lists. Our test suite should return slightly different answers:
Using evalo to run our program as we write it, we notice a refutation:
Checking each test individually, we learn that our last test case is unsatisfiable.
Relaxing expressions that contain logic variables suggests a problem with the right-hand-side of our
else clause. Using
(cons (car ls) ,B) causes the last test to fail, while
(cons ,C ,B) does not.
Stop and think:
What is wrong with this partial definition of remove?
The problematic right-hand-side (cons (car ls) ,B) always constructs a pair containing the car of ls as the first element, even if that car contains the symbol we wish to remove. In other words, there is never a recursive call on the car of ls. If we look at the list passed to
remove in the failing test case, we notice that its first element is itself a list containing the item we'd like to remove. If the first element of
ls is a list, we need to process it before including it in our result. Adding a
[(pair? (car ls)) ...] clause after the
[(null? ls) ...] clause allows our test to be satisfied.
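For reference, here is our own complete definition of the deep removal (a sketch for comparison, not the answer synthesized in the text):

```racket
;; Remove every occurrence of x anywhere in the s-expression ls.
(define (remove x ls)
  (cond
    [(null? ls) '()]
    [(pair? (car ls))
     (cons (remove x (car ls)) (remove x (cdr ls)))]
    [(equal? (car ls) x) (remove x (cdr ls))]
    [else (cons (car ls) (remove x (cdr ls)))]))
```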
The debugging steps in this section are tedious to use by hand, but they could provide rich interactive feedback if automatically executed by an interactive development environment. It is interesting that we're able to refute programs without needing the support of a type system (though it is possible to combine a type inferencer, written as a relation, with the interpreter).
Here's a tiny proof-checker for propositional logic, written in Racket:
A proof is represented as a list
(conclusion assumptions rule-name sub-proofs) where:
conclusion is a miniKanren term;
assumptions is a list of terms;
rule-name is a symbol;
sub-proofs is a list of zero or more proofs.
This proof-checker has only three rules: assumptions, such as the proposition
A or the implication
A => B;
modus ponens, which derives B from proofs of A and A => B; and conditional proof, which derives A => B if we can derive B after assuming A.
For example, we can check this proof that
C holds, assuming A,
A => B, and
B => C:
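In the four-part representation above, that proof might be written as follows (the ordering of sub-proofs within modus ponens is an assumption on our part):

```racket
;; Each proof is (conclusion assumptions rule-name sub-proofs).
(define assms '(A (A => B) (B => C)))

(define example-proof
  `(C ,assms modus-ponens
      (((B => C) ,assms assumption ())
       (B ,assms modus-ponens
          (((A => B) ,assms assumption ())
           (A ,assms assumption ()))))))
```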
Challenge 6. Write a tiny theorem prover for the logic of the proof checker.
Stop and think: How would you solve Challenge 6?
How might we write our tiny prover? A simple (though inefficient) algorithm might use a goal-directed, breadth-first search, which is complete over infinite trees. But miniKanren already implements a complete, goal-directed search! What happens if we just run the proof checker in the relational interpreter we've used for the previous challenges?
Just as we treated the function
append as a relation by running it inside the relational interpreter, we can treat the
proof? function as a relation that connects assumptions, proofs, and conclusions. Here we query the
proof? function to generate proofs of a theorem, given a set of assumptions:
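Such a query might look like this sketch, where proof?-bindings is a hypothetical variable holding the quoted letrec bindings that define proof? (and any helpers); its exact contents are not shown here:

```racket
;; Sketch: run proof? "backwards" to search for a proof of C.
(run 1 (prf)
  (fresh (rule subproofs)
    ;; Fix the conclusion and assumptions; leave the rest unknown.
    (== prf `(C (A (A => B) (B => C)) ,rule ,subproofs))
    ;; proof?-bindings is hypothetical: the quoted definitions of
    ;; member? and proof? as letrec bindings.
    (evalo `(letrec ,proof?-bindings (proof? ',prf)) #t)))
```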
Sure enough, miniKanren finds the proof we checked previously.
By instantiating different parts of the query, we can use
proof? in different ways. We can partially instantiate the proof and ask the query to fill in the missing parts, or provide assumptions and ask the query to search for all valid conclusions with corresponding proofs. This method is powerful enough to prove interesting theorems, such as transitivity of implication:
We can also prove commutativity of conjunction where conjunction is encoded with implication.
Challenge 7. Generate a quine that uses quasiquote instead of cons.
The quines using
cons generated in section 2 are a little verbose.
Surely a quine can be more concise! Perhaps a quine could use
quasiquote to construct the expression.
But the evalo relational interpreter doesn't include the quasiquote form.
We could extend the interpreter to add it, but is there another way?
Stop and think: How would you solve Challenge 7 without modifying the relational interpreter?
quasiquote can't be implemented as a function, since it does not directly evaluate its argument, but rather interprets its argument to find
unquote forms. Thus we can't simply define
quasiquote in the interpreter with lambda.
We need to go more meta! We'll use a meta-circular evaluator running inside the relational interpreter.
Let's write an interpreter for a small subset of Racket that includes
quasiquote, in the subset of Racket supported by the relational interpreter. Our query asserts that applying the nested interpreter to the quoted s-expression of the quine should return that same s-expression.
The unfortunate representation of the pattern for matching unquote expressions results from an oddity of Racket's quasiquotation support. In other Scheme implementations that line can be written [`(,`unquote ,exp) (eval exp)].
To keep the meta-circular evaluator concise, we don't implement nested quasiquotation.
Instead, whenever the meta-circular evaluator encounters a nested
quasiquote it produces an error by illegally attempting to call a symbol.
The outer, relational interpreter handles such an error as failure, so the combination of interpreters won't generate programs using nested quasiquotation.
We use the same technique to fail upon variable lookup in the empty environment, which we represent with the symbol
letrec is necessary because our standard version of
evalo does not support mutually-recursive definitions, though the optimized
evalo used earlier does support mutual recursion.
Insight: The outer, relational interpreter must use a first-order representation of environments and closures, as miniKanren only supports first-order terms. However, our Racket-in-Racket can use higher order representations thanks to its meta-circular use of that outer implementation. Environments are represented as functions that accept a symbol and return the value to which it is bound. When extending an environment, we wrap it in a new function that first checks a variable against the new binding. If the variable doesn't match, the new function delegates to the original environment by calling the function representing it.
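The functional environment representation just described can be sketched as follows (we signal lookup failure with error for illustration; the text's version instead fails by applying a symbol, which the outer relational interpreter treats as failure):

```racket
;; An environment is a function from a symbol to its bound value.
(define empty-env
  (lambda (y) (error 'lookup "unbound variable: ~a" y)))

;; Extending an environment wraps it in a function that checks the
;; new binding first, then delegates to the original environment.
(define (ext-env x v env)
  (lambda (y)
    (if (eq? y x)
        v
        (env y))))
```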
The meta-circular evaluator uses pattern matching to recognize each form of expression. In the relational setting it's even more important than usual to specify programs unambiguously. Consider what might happen if rather than using
match, we were to use this conditional expression to recognize the syntax for a
When evaluating an instantiated expression with well-formed syntax, this condition should work. But when generating expressions, miniKanren can invent any expression that satisfies the condition. It might generate the identity function like this:
The reified logic variables _.0 and _.1 indicate that any term whatsoever may appear in those positions. Although the values in those positions don't affect the behavior of the expression in this evaluator, they both have to be the empty list in order for this function to behave the way we expect in Racket.
To ensure the expression assumes the desired syntax, the condition would need two more checks to fully specify the form:
(null? (cdadr expr)) and
(null? (cdddr expr)). In contrast, the natural way of writing the evaluator with
match fully specifies the format.
Our quasiquote quine-generating query returns the answer:
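The classic quasiquote quine has the following shape (the query's actual answer may be formatted differently):

```racket
;; Applying the lambda to its own quoted text reproduces the
;; whole expression.
((lambda (x) `(,x ',x))
 '(lambda (x) `(,x ',x)))
```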
Finding quines through the meta-circular evaluator is much slower than the approach we used in section 2—this query takes 5 minutes with our current hardware—but it allows us to generate programs using a new language form without modifying our relational interpreter.
While solving these programming challenges we've shown that a relational interpreter can be used to solve myriad problems: example-based synthesis of recursive programs; refutation and bug-finding for incomplete programs; and generating program inputs with desired properties, using both a single interpreter and multiple interpreters with different semantics. Our success in solving the most difficult of these challenges—such as synthesis of recursive programs—is due to optimizations that use convenient properties of purely functional programming languages. The freedom to re-order evaluation, thanks to lack of side effects, is particularly important.
We also found that relational interpreters can confer relational capabilities on higher-order functional languages, even through multiple levels of interpretation. This pearl is a concrete demonstration that a functional program can have more (and surprising!) behaviors than just that for which it was originally written. In some sense every experienced functional programmer knows this to be true—for example, we write semantics as relations. The relational interpreter approach provides a technique for exercising the fully general relational meaning of a functional program.
This material is partially based on research sponsored by DARPA under agreement number AFRL FA8750-15-2-0092 and by NSF under CAREER grant 1350344. The views expressed are those of the authors and do not reflect the official policy or position of the Department of Defense or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon.
Stuart Halloway pointed out that the relational interpreter should be able to generate quines; thank you, Stuart! We thank Larry Moss and the members of the Indiana University logic seminar for their encouragement, and for challenging us to generate twines and thrines. Eric Holk and Dan Friedman worked with co-author Byrd on quine generation in miniKanren, which led to a paper from which Challenge 2 was taken.
We thank Dan Friedman, Nada Amin, Jason Hemann, Rob Zinkov, Tom Gilray, Lisa Zhang, Jon Smock, Dann Toliver, Annie Cherkaev, Guannan Wei, Dmitri Boulytchev, Evan Donahue, Steve Gilardi, and the anonymous reviewers for their many helpful comments.