[pypy-svn] r41537 - pypy/dist/pypy/doc
auc at codespeak.net
Tue Mar 27 19:30:56 CEST 2007
Author: auc
Date: Tue Mar 27 19:30:55 2007
New Revision: 41537
Modified:
pypy/dist/pypy/doc/howto-logicobjspace.txt
Log:
big update
Modified: pypy/dist/pypy/doc/howto-logicobjspace.txt
==============================================================================
--- pypy/dist/pypy/doc/howto-logicobjspace.txt (original)
+++ pypy/dist/pypy/doc/howto-logicobjspace.txt Tue Mar 27 19:30:55 2007
@@ -5,11 +5,10 @@
Outline
=======
-This document gives some (outdated wrt what sits in the repository, a
-much better document to read would be the `EU Interim Report`_)
-information about the content and usage of an extension of PyPy known
-as the Logic Objectspace (LO). The LO, when finished, will provide
-additional builtins that will allow to write:
+This document gives some information about the content and usage of an
+extension of PyPy known as the Logic Objectspace (LO), and also about
+the constraint programming library that ships with PyPy. The LO, when
+finished, will provide additional builtins that make it possible to write:
* concurrent programs based on coroutines scheduled by dataflow logic
variables,
@@ -20,25 +19,32 @@
* new search "engines" to help solve logic/constraint programs.
-The 0.9 preview comes without logic programming; the constraint solver
-is only lightly tested, is not equipped with some specialized but
-important propagators for linear relations on numeric variables, and
-*might* support concurrency - but that would be an accident; the
-dataflow scheduling of coroutines is known to fail in at least one
-basic and important case.
-
-In this document, we skim over these topics, hoping to give enough
-information and examples for an uninformed user to understand what is
-going on and how to use the provided functionality.
+Currently, the `integrated concurrent logic and constraint
+programming` part is, unfortunately, quite unfinished. It will take
+some effort, time and knowledge of PyPy internals to finish it.
+However, we provide a full-blown constraint-solving infrastructure
+that can be used (and extended) out of the box.
-To fire up a working PyPy with the LO, please type::
+To fire up a working standard PyPy with the constraint library,
+please type::
-/root-of-pypy-dist/pypy/bin/py.py -o logic --withmod-_stackless
+ /root-of-pypy-dist/pypy/bin/py.py --withmod-_cslib
+To fire up a working PyPy with the LO (including the constraint
+solving library), please type::
+
+ /root-of-pypy-dist/pypy/bin/py.py -o logic
+
+More information is available in the `EU Interim Report`_, especially
+with respect to the (unfinished) integrated framework for constraint
+and logic programming.
Logic Variables and Dataflow Synchronisation of Coroutines
==========================================================
+This section relies on the LO, so you should try the examples with a
+logic build or the `-o logic` argument to `py.py`.
+
Logic Variables
+++++++++++++++
@@ -68,13 +74,11 @@
The single-assignment property is easily checked::
- bind(X, 'hello') # would raise a FailureException
+ bind(X, 'hello') # would raise a RebindingError
bind(X, 42) # is admitted (it is a no-op)
-In the current state of the LO, a generic Exception will be raised.
It is quite obvious from this that logic variables are really objects
-acting as boxes for python values. No syntactic extension to Python is
-provided yet to lessen this inconvenience.
+acting as boxes for Python values.
The bind operator is low-level. The more general operation that binds
a logic variable is known as "unification". Unify is an operator that
@@ -83,17 +87,22 @@
important twist: unify mutates the state of the involved logic
variables.
-Unifying structures devoid of logic variables, like::
+Unify is thus defined as follows (it is symmetric):
- unify([1, 2], [1, 2])
- unify(42, 43)
+.. raw:: latex
-is equivalent to an assertion about their equality, the difference
-being that a FailureException will be raised instead of an
-AssertionError, would the assertion be violated::
+ \begin{center}
+ \begin{tabular}{|l|l|l|} \hline
+ \textbf{Unify} & \textbf{value} & \textbf{unbound var} \\ \hline
+ \textbf{value} & equal? & bind \\ \hline
+ \textbf{unbound var} & bind & alias \\ \hline
+ \end{tabular}
+ \end{center}
- assert [1, 2] == [1, 2]
- assert 42 == 43
+Unifying structures devoid of logic variables, like::
+
+ unify([1, 2], [1, 2])
+ unify(42, 43) # raises UnificationError
A basic example involving logic variables embedded into dictionaries::
@@ -102,14 +111,6 @@
{'a': Z, 'b': W})
assert Z == W == 42
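
The three cases of the unification table above can be modelled with a
tiny pure-Python sketch (scalar values only; the names ``Var`` and
``deref`` are ours for illustration and are not part of the LO API)::

```python
# Toy model of the unify table: value/value, value/var, var/var.
# Illustrative only -- this is NOT the LO implementation.
class Var:
    """An initially unbound logic variable."""
    def __init__(self):
        self.ref = self          # unbound: points to itself

def deref(v):
    """Follow aliasing chains down to a value or an unbound Var."""
    while isinstance(v, Var) and v.ref is not v:
        v = v.ref
    return v

def unify(a, b):
    a, b = deref(a), deref(b)
    if isinstance(a, Var):
        a.ref = b                # bind; aliases if b is unbound too
    elif isinstance(b, Var):
        b.ref = a                # bind
    elif a != b:                 # value/value: must be equal
        raise ValueError("UnificationError")

X, Y = Var(), Var()
unify(X, Y)                      # unbound/unbound: alias
unify(Y, 42)                     # unbound/value: bind
assert deref(X) == deref(Y) == 42
unify(41, 41)                    # equal values: a no-op
```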
-Unifying one unbound variable with some value (a) means assigning the
-value to the variable (which then satisfies equality), unifying two
-unbound variables (b) aliases them (they are constrained to reference
-the same -future- value).
-
-Assignment or aliasing of variables is provided underneath by the
-'bind' operator.
-
An example involving custom data types::
class Foo(object):
@@ -183,28 +184,32 @@
Wait and wait_needed make it possible to write efficient lazily evaluated code.
-Using the "uthread" builtin (which spawns a coroutine and applies the
+Using the "stacklet" builtin (which spawns a coroutine and applies the
2..n args to its first arg), here is how to implement a
producer/consumer scheme::
- def generate(n, limit):
+ from cclp import stacklet
+
+ def generate(n, limit, R):
if n < limit:
- return (n, generate(n + 1, limit))
- return None
+ Tail = newvar()
+ unify(R, (n, Tail))
+ return generate(n + 1, limit, Tail)
+ bind(R, None)
+ return
- def sum(L, a):
+ def sum(L, a, R):
Head, Tail = newvar(), newvar()
unify(L, (Head, Tail))
if Tail != None:
- return sum(Tail, Head + a)
- return a + Head
+ return sum(Tail, Head + a, R)
+ bind(R, a + Head)
+ return
X = newvar()
S = newvar()
-
- unify(S, uthread(sum, X, 0))
- unify(X, uthread(generate, 0, 10))
-
+ stacklet(sum, X, 0, S)
+ stacklet(generate, 0, 10, X)
assert S == 45
Note that this eagerly generates all elements before the first of them
@@ -219,55 +224,23 @@
bind(L, (n, Tail))
lgenerate(n+1, Tail)
- def lsum(L, a, limit):
+ def lsum(L, a, limit, R):
"""this summer controls the generator"""
if limit > 0:
Head, Tail = newvar(), newvar()
- wait(L)
+ wait(L) # wakes up those waiting by need on L
unify(L, (Head, Tail))
- return lsum(Tail, a+Head, limit-1)
+ return lsum(Tail, a+Head, limit-1, R)
else:
- return a
+ bind(R, a)
Y = newvar()
T = newvar()
- uthread(lgenerate, 0, Y)
- unify(T, uthread(lsum, Y, 0, 10))
-
- wait(T)
+ stacklet(lgenerate, 0, Y)
+ stacklet(lsum, Y, 0, 10, T)
assert T == 45
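
The blocking behaviour of wait/bind can be modelled in plain Python
with preemptive threads (an analogy only: the LO schedules coroutines,
not OS threads, and ``DataflowVar`` below is our own illustrative name,
not an LO builtin)::

```python
import threading

class DataflowVar:
    """Toy dataflow variable: wait() blocks until bind() supplies a value."""
    def __init__(self):
        self._bound = threading.Event()
        self._value = None

    def bind(self, value):
        if self._bound.is_set():
            raise RuntimeError("rebinding")  # cf. RebindingError in the LO
        self._value = value
        self._bound.set()

    def wait(self):
        self._bound.wait()       # block until the variable is bound
        return self._value

X, S = DataflowVar(), DataflowVar()
consumer = threading.Thread(target=lambda: S.bind(X.wait() + 1))
consumer.start()                 # the consumer blocks inside X.wait()
X.bind(41)                       # ... and unblocks here
consumer.join()
assert S.wait() == 42
```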
-Please note that in the current LO, we deal with coroutines, not
-threads (thus we can't rely on preemptive scheduling to lessen the
-problem with the eager consumer/producer program). Also nested
-coroutines don't schedule properly yet. This impacts the ability to
-write a simple program like the following::
-
- def sleep(X, Barrier):
- wait(X)
- bind(Barrier, True)
-
- def wait_two(X, Y):
- Barrier = newvar()
- uthread(sleep, X, Barrier)
- uthread(sleep, Y, Barrier)
- wait(Barrier)
- if is_free(Y):
- return 1
- return 2
-
- X, Y = newvar(), newvar()
- o = uthread(wait_two, X, Y)
- unify(X, Y)
- unify(Y, 42)
- assert X == Y == 42
- assert o == 2
-
-Finally, it must be noted that the bind/unify and wait pair of
-operations are quite similar to the asynchronous send and receive
-primitives commonly used for inter-process communication.
-
The operators table
-------------------
@@ -281,270 +254,95 @@
Coroutine spawning
- uthread/n | 1 <= n
- callable, opt args. -> logic var.
-
+ stacklet/n
+ callable, (n-1) optional args -> None
Constraint Programming
======================
-The LO comes with a flexible, extensible constraint solver
-engine. While regular search strategies such as depth-first or
-breadth-first search are provided, you can write better, specialized
-strategies (an example would be best-search). We therein describe how
-to use the solver to specify and get the solutions of a constraint
-satisfaction problem, and then highlight how to extend the solver with
-new strategies.
-
-Using the constraint engine
-+++++++++++++++++++++++++++
+PyPy comes with a flexible, extensible constraint solver engine based
+on the CPython Logilab constraint package (we paid attention to API
+compatibility). Here we describe how to use the solver to specify a
+constraint satisfaction problem and get its solutions.
Specification of a problem
---------------------------
+++++++++++++++++++++++++++
A constraint satisfaction problem is defined by a triple (X, D, C)
where X is a set of finite domain variables, D the set of domains
associated with the variables in X, and C the set of constraints, or
relations, that bind together the variables of X.
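
Concretely, such a triple can be sketched in plain Python with a naive
generate-and-test loop (for illustration only; the real solver prunes
the domains by constraint propagation instead of enumerating every
combination)::

```python
from itertools import product

# The triple (X, D, C): variables, domains, constraints.
X = ['x', 'y']
D = {'x': ['spam', 'egg', 'ham'], 'y': [3, 4, 5]}
C = [lambda s: len(s['x']) == s['y']]    # the name of x has length y

# Brute force: enumerate the cross-product of the domains and keep
# the assignments satisfying every constraint.
solutions = []
for vals in product(*(D[v] for v in X)):
    s = dict(zip(X, vals))
    if all(c(s) for c in C):
        solutions.append(s)

assert {'x': 'spam', 'y': 4} in solutions
assert len(solutions) == 3               # 'egg'/3 and 'ham'/3 as well
```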
-Note that the constraint variables are NOT logic variables. Not yet
-anyway.
-
So we basically need a way to declare variables, their domains and
-relations; and something to hold these together. The later is what we
-call a "computation space". The notion of computation space is broad
-enough to encompass constraint and logic programming, but we use it
-there only as a box that holds the elements of our constraint
-satisfaction problem. Note that it is completely unrelated to the
-notion of object space (as in the logic object space).
-
-A problem is a one-argument procedure defined as follows::
-
- def simple_problem(cs):
- cs.var('x', FiniteDomain(['spam', 'egg', 'ham']))
- cs.var('y', FiniteDomain([3, 4, 5]))
-
-This snippet defines a couple of variables and their domains, on the
-'cs' argument which is indeed a computation space. Note that we didn't
-take a reference of the created variables. We can query the space to
-get these back if needed, and then complete the definition of our
-problem. Our problem, continued::
-
- ... x = cs.find_var('x')
- y = cs.find_var('y')
- cs.tell(make_expression([x,y], 'len(x) == y'))
-
- return x, y
-
-We must be careful to return the set of variables whose candidate
-values we are interested in. The rest should be sufficiently
-self-describing...
-
-Getting solutions
------------------
-
-Now to get and print solutions out of this, we must::
-
- import solver
- cs = newspace()
- cs.define_problem(simple_problem)
-
- for sol in solver.solve(cs):
- print sol
-
-The builtin solve function returns a generator. You will note with
-pleasure how slow the search can be on a solver running on a Python
-interpreter written in Python, the later running on top of
-cpython... It is expected that the compiled version of PyPy + LO will
-provide decent performance.
-
-Table of Operators
-------------------
-
-Note that below, "variable/expression designators" really are strings.
-
-Space creation
-
- newspace/0
-
-Finite domain creation
-
- FiniteDomain/1
- list of any -> FiniteDomain
-
-Expressions
-
- make_expression/2
- list of var. designators, expression designator -> Expression
-
- AllDistinct/1
- list of var. designators -> Expression
-
-Space methods
-
- var/2
- var. designator, FiniteDomain -> constraint variable instance
-
- find_var/1
- var. designator -> constraint variable instance
-
- tell/1
- Expression -> None
-
- define_problem/1
- procedure (space -> tuple of constraint variables) -> None
-
-Extending the search engine
-+++++++++++++++++++++++++++
-
-Writing a solver
-----------------
-
-Here we show how the additional builtin primitives allow you to write,
-in pure Python, a very basic solver that will search depth-first and
-return the first found solution.
-
-As we've seen, a CSP is encapsulated into a so-called "computation
-space". The space object has additional methods that allow the solver
-implementor to drive the search. First, let us see some code driving a
-binary depth-first search::
-
- 1 def first_solution_dfs(space):
- 2 status = space.ask()
- 3 if status == 0:
- 4 return None
- 5 elif status == 1:
- 6 return space.merge()
- 7 else:
- 8 new_space = space.clone()
- 9 space.commit(1)
- 10 outcome = first_solution_dfs(space)
- 11 if outcome is None:
- 13 new_space.commit(2)
- 14 outcome = first_solution_dfs(new_space)
- 15 return outcome
-
-This recursive solver takes a space as argument, and returns the first
-solution or None. Let us examine it piece by piece and discover the
-basics of the solver protocol.
-
-The first thing to do is "asking" the space about its status. This may
-force the "inside" of the space to check that the values of the
-domains are compatibles with the constraints. Every inconsistent value
-is removed from the variable domains. This phase is called "constraint
-propagation". It is crucial because it prunes as much as possible of
-the search space. Then, the call to ask returns a positive integer
-value which we call the space status; at this point, all (possibly
-concurrent) computations happening inside the space are terminated.
-
-Depending on the status value, either:
-
-* the space is failed (status == 0), which means that there is no
- combination of values of the finite domains that can satisfy the
- constraints,
-
-* one solution has been found (status == 1): there is exactly one
- valuation of the variables that satisfy the constraints,
-
-* several branches of the search space can be taken (status represents
- the exact number of available alternatives, or branches).
-
-Now, we have written this toy solver as if there could be a maximum of
-two alternatives. This assumption holds for the simple_problem we
-defined above, where a binary "distributor" (see below for an
-explanation of this term) has been chosen automatically for us, but
-not in the general case. See the sources for a more general-purpose
-`solver`_ and more involved `sample problems`_ (currently, probably
-only conference_scheduling is up to date with the current API).
-
-In line 8, we take a clone of the space; nothing is shared between
-space and newspace (the clone). We now have two identical versions of
-the space that we got as parameter. This will allow us to explore the
-two alternatives. This step is done, line 9 and 13, with the call to
-commit, each time with a different integer value representing the
-branch to be taken. The rest should be sufficiently self-describing.
-
-This shows the two important space methods used by a search engine:
-ask, which waits for the stability of the space and informs the solver
-of its status, and commit, which tells a space which road to take in
-case of a fork.
-
-Using distributors
-------------------
-
-Now, earlier, we talked of a "distributor": it is a program running in
-a computation space. It could be anything, and in fact, in the final
-version of the LO, it will be any Python program, augmented with calls
-to non-deterministic choice points. Each time a program embedded in a
-computation space reaches such a point, it blocks until some Deus ex
-machina makes the choice for him. Only a solver can be responsible for
-the actual choice (that is the reason for the name "non
-deterministic": the decision does not belong to the embedded program,
-only to the solver that drives it).
-
-In the case of a CSP, the distributor is a simple piece of code, which
-works only after the propagation phase has reached a fixpoint. Its
-policy will determine the fanout, or branching factor, of the current
-computation space (or node in the abstract search space).
-
-Here are two examples of distribution strategies:
-
-* take the variable with the biggest domain, and remove exactly one
- value from its domain; thus we always get two branches: one with the
- value removed, the other with only this value remaining,
-
-* take a variable with a small domain, and keep only one value in the
- domain for each branch (in other words, we "instantiate" the
- variable); this makes for a branching factor equal to the size of
- the domain of the variable.
-
-There are a great many ways to distribute... Some of them perform
-better, depending on the characteristics of the problem to be
-solved. But there is no absolutely better distribution strategy. Note
-that the second strategy given as example there is what is used (and
-hard-wired) in the MAC algorithm.
-
-Currently in the LO we have two builtin distributors:
-
-* NaiveDistributor, which distributes domains by splitting the
- smallest domain in 2 new domains; the first new domain has a size of
- one, and the second has all the other values,
-
-* SplitDistributor, which distributes domains by splitting the
- smallest domain in N equal parts (or as equal as possible). If N is
- 0, then the smallest domain is split in domains of size 1; a special
- case of this, DichotomyDistributor, for which N == 2, is also
- provided and is the default one.
-
-To explicitly specify a distributor for a constraint problem, you
-need to say, in the procedure that defines the problem::
-
- cs.set_distributor(NaiveDistributor())
-
-It is not possible currently to write distributors in pure Python;
-this is scheduled for PyPy version 1.
-
-Remaining space operators
--------------------------
-
-For solver writers
-
- ask/0
- nothing -> a positive integer i
-
- commit/1
- integer in [1, i] -> None
-
- merge/0
- nothing -> list of values (solution)
-
-For distributor writers
-
- choose
-
-
-.. _`solver`: http://codespeak.net/svn/pypy/dist/pypy/lib/constraint/solver.py
-.. _`sample problems`: http://codespeak.net/svn/pypy/dist/pypy/objspace/test/problem.py
-.. _`Oz programming language`: http://www.mozart-oz.org
+relations; and something to hold these together.
+
+Let's have a look at a reasonably simple example of a constraint
+program::
+
+ from cslib import *
+
+ variables = ('c01','c02','c03','c04','c05',
+ 'c06','c07','c08','c09','c10')
+ values = [(room,slot)
+ for room in ('room A','room B','room C')
+ for slot in ('day 1 AM','day 1 PM',
+ 'day 2 AM','day 2 PM')]
+ domains = {}
+
+ # let us associate the variables to their domains
+ for v in variables:
+ domains[v] = fd.FiniteDomain(values)
+
+ # let us define relations/constraints on the variables
+ constraints = []
+
+ # Internet access is in room C only
+ for conf in ('c03','c04','c05','c06'):
+ constraints.append(fd.make_expression((conf,),
+ "%s[0] == 'room C'"%conf))
+
+ # Speakers only available on day 1
+ for conf in ('c01','c05','c10'):
+ constraints.append(fd.make_expression((conf,),
+ "%s[1].startswith('day 1')"%conf))
+ # Speakers only available on day 2
+ for conf in ('c02','c03','c04','c09'):
+ constraints.append(fd.make_expression((conf,),
+ "%s[1].startswith('day 2')"%conf))
+
+ # try to satisfy people willing to attend several conferences
+ groups = (('c01','c02','c03','c10'),
+ ('c02','c06','c08','c09'),
+ ('c03','c05','c06','c07'),
+ ('c01','c03','c07','c08'))
+ for g in groups:
+ for conf1 in g:
+ for conf2 in g:
+ if conf2 > conf1:
+ constraints.append(fd.make_expression((conf1,conf2),
+ '%s[1] != %s[1]'%\
+ (conf1,conf2)))
+
+ constraints.append(fd.AllDistinct(variables))
+
+ # now, give the triple (X, D, C) to a repository object
+ r = Repository(variables,domains,constraints)
+
+ # that we can give to one solver of our choice
+ # (there, it is the default depth-first search,
+ # find-all solutions solver)
+
+ solutions = Solver().solve(r, 0)
+ assert len(solutions) == 64
+
+Extending the solver machinery
+++++++++++++++++++++++++++++++
+
+The core of the solving system is written in pure RPython and resides
+in the rlib/cslib library. It should be quite easy to subclass the
+provided elements to get specialized, optimized variants. A PyPy
+module built on top of this library exposes the low-level engine
+functionality at application level.
+
.. _`EU Interim Report`: http://codespeak.net/pypy/extradoc/eu-report/D09.1_Constraint_Solving_and_Semantic_Web-interim-2007-02-28.pdf