derivation: sick Python trick of the week

Robert Brewer fumanchu at amor.org
Sun Feb 22 19:09:11 EST 2004


WARNING:

The following is NOT all Pythonic. However, it is Python. This is just
fun--nothing I'm going to put into production. I don't have any
questions or problems--just thought I'd share.


BACKGROUND:

I've been playing around for a couple of days with parsing and
derivation ("unparsing"). My goal was and is to take snippets of Python
code and turn them into "equivalent" SQL statements. For example, the
Python snippet:

x.Group > 3 or x.Name == 'fumanchu'

...should equate to an SQL WHERE clause like:

Group > 3 or Name = 'fumanchu'

I tore into the compiler package, and hacked together a solution which
does just that. One invokes such a task with code like:

sqlstmt = derive.ADODeriver().evaluate("x.Group > 3 or x.Name ==
'fumanchu'")

However, it's parsing the output of a compiler.Transformer, which itself
is parsing an AST (lots of nested tuples), so it's not very quick. 1000
repetitions of the above example take over 1/2 second to run on my
laptop.


IT'S ALIVE!!!

In a fit of mad-scientist genius (get out the pitchforks and torches), I
wondered if it might be faster to let Python do *all* the parsing, and
just watch it work and take notes. That's what the code below does. The
sick and twisted part is how one invokes this technique:

>>> import sick_derive
>>> x = sick_derive.Expression()
>>> (x.a == 3) & ((x.b > 1) | (x.b < -10))
>>> x
['a', 3, <built-in function eq>, 'b', 1, <built-in function gt>, 'b',
-10, <built-in function lt>, <built-in function or_>, <built-in function
and_>]
>>> sick_derive.Deriver().derive(x)
'((a == 3) and ((b > 1) or (b < -10)))'

Yes, in line two we run comparisons and boolean operations on
non-existent attributes of x, and discard the results! Sick. However, we
were keeping track (as the repr of x shows on line 3); we built a
postfix list of the comparisons we performed. When we call
Deriver().derive(x), the Deriver traverses that list and rebuilds the
Python code. I made a similar ADODeriver which builds SQL code (too
complex to include here; it also used a different Adapter for the
object-to-DB mapper).

I had to replace boolean 'and' and 'or' with binary calls in order to
override them, and 'not' with 'neg'. That makes it even sicker. And it's
*far* from obvious that 'x.a' should reduce to just 'a'.

But it's about 7 times as fast as the first solution. ;)


So for a host of reasons, don't ever use this. I won't. But it was
interesting.


Robert Brewer
MIS
fumanchu at amor.org


---- sick_derive.py ----

import operator


def containedby(a, b):
    """Return the outcome of the test a in b. Note the operand order."""
    return a in b

def startswith(a, b):
    """Return True if string starts with the prefix, otherwise return
False."""
    return a.startswith(b)

def endswith(a, b):
    """Return True if the string ends with the suffix, otherwise return
False."""
    return a.endswith(b)


class Operand(object):
    """Push atoms onto a postfix expression stack."""
    def __init__(self, expr, name=''):
        self.expr = expr
        self.name = name
    
    def __neg__(self):
        self.expr.append(operator.not_)
        return Operand(self.expr)
    
    def __lt__(self, other):
        self.expr.extend([self.name, other, operator.lt])
        return Operand(self.expr)
    
    def __le__(self, other):
        self.expr.extend([self.name, other, operator.le])
        return Operand(self.expr)
    
    def __eq__(self, other):
        if self.name:
            self.expr.extend([self.name, other, operator.eq])
        else:
            self.expr.extend([other, operator.eq])
        return Operand(self.expr)
    
    def __ne__(self, other):
        self.expr.extend([self.name, other, operator.ne])
        return Operand(self.expr)
    
    def __ge__(self, other):
        self.expr.extend([self.name, other, operator.ge])
        return Operand(self.expr)
    
    def __gt__(self, other):
        self.expr.extend([self.name, other, operator.gt])
        return Operand(self.expr)
    
    def __and__(self, other):
        self.expr.append(operator.and_)
        return Operand(self.expr)
    
    def __or__(self, other):
        self.expr.append(operator.or_)
        return Operand(self.expr)
    
    def __contains__(self, other):
        self.expr.extend([self.name, other, operator.contains])
        return Operand(self.expr)
    
    def contains(self, other):
        self.expr.extend([self.name, other, operator.contains])
        return Operand(self.expr)
    
    def containedby(self, other):
        self.expr.extend([self.name, other, containedby])
        return Operand(self.expr)
    
    def startswith(self, other):
        self.expr.extend([self.name, other, startswith])
        return Operand(self.expr)
    
    def endswith(self, other):
        self.expr.extend([self.name, other, endswith])
        return Operand(self.expr)


class Expression(list):
    """A postfix recorder and evaluator for operations on arbitrary
objects."""
    
    unaries = (operator.not_, )
    binaries = (operator.and_,
                operator.or_,
                operator.lt,
                operator.le,
                operator.eq,
                operator.ne,
                operator.gt,
                operator.ge,
                operator.contains,
                containedby,
                startswith,
                endswith,
                )
    
    def __getattr__(self, name):
        return Operand(self, name)
    
    def evaluate(self, obj, ifEmpty=True):
        """Evaluate an object against self.
        Names will be looked up in the object's attribute dictionary.
        """
        stack = self[:]
        if not stack:
            return ifEmpty
        
        def operate():
            op = stack.pop()
            if op in self.unaries:
                a = operate()
                return op(a)
            elif op in self.binaries:
                b = operate()
                a = operate()
                if a not in (True, False):
                    a = getattr(obj, a)
                return op(a, b)
            else:
                return op
        return operate()


class Adapter(object):
    """Transform values according to their type."""
    
    def __init__(self):
        self.default_processor = repr
        self.processors = {}
    
    def process(self, value):
        try:
            xform = self.processors[type(value)]
        except KeyError:
            xform = self.default_processor
        return xform(value)


class Deriver(object):
    """Derive an Expression according to a grammar."""
    
    def __init__(self, adapter=Adapter()):
        f = adapter.process
        self.unaries = {operator.not_: lambda x: "(not %s)" % x}
        self.binaries = {operator.and_: lambda x, y: "(%s and %s)" % (x,
y),
                         operator.or_: lambda x, y: "(%s or %s)" % (x,
y),
                         operator.lt: lambda x, y: "(%s < %s)" % (x,
f(y)),
                         operator.le: lambda x, y: "(%s <= %s)" % (x,
f(y)),
                         operator.eq: lambda x, y: "(%s == %s)" % (x,
f(y)),
                         operator.ne: lambda x, y: "(%s != %s)" % (x,
f(y)),
                         operator.gt: lambda x, y: "(%s > %s)" % (x,
f(y)),
                         operator.ge: lambda x, y: "(%s >= %s)" % (x,
f(y)),
                         operator.contains: lambda x, y: "(%s in %s)" %
(f(y), x),
                         containedby: lambda x, y: "(%s in %s)" % (x,
f(y)),
                         startswith: lambda x, y: "(%s.startswith(%s))"
% (x, f(y)),
                         endswith: lambda x, y: "(%s.endswith(%s))" %
(x, f(y)),
                         }
        self.ifEmpty = ''
    
    def derive(self, expr):
        try:
            stack = expr[:]
        except TypeError, x:
            x.args += (type(expr), )
            raise x
        
        if not stack:
            return self.ifEmpty
        
        def operate():
            op = stack.pop()
            if op in self.unaries:
                a = operate()
                return self.unaries[op](a)
            elif op in self.binaries:
                b = operate()
                a = operate()
                return self.binaries[op](a, b)
            else:
                return op
        return operate()




More information about the Python-list mailing list