Behavioural identity - a short discussion

Kay Schluehr kay.schluehr at gmx.net
Sun Apr 10 16:37:22 EDT 2005


In mathematics two functions can be considered equal when they
represent the same function graph. This is nothing but a
set-theoretical identity. It is a nice criterion in theory but a bad
one in practice, because it is impossible to calculate all values of an
arbitrary function, and this is true not only in practice but also in
theory. So mathematicians distinguish certain classes of functions,
e.g. polynomials or power series, and prove identity theorems about
objects in those classes.

But what can be said about the equality of two arbitrary
Python-functions f and g? First of all not very much. If we define the
trivial functions

def f():pass
def g():pass

and ask for equality, the two function objects are compared by
identity, which shows us that f and g are different.
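
A minimal sketch of that comparison, assuming Python 3 (the session in
this post is older). Functions do not define their own equality, so ==
falls back to object identity, and the default hash is derived from
that same identity:

```python
def f(): pass
def g(): pass

# Functions inherit the default comparison, which is identity-based.
same_object = (f == f)    # an object always equals itself
different = (f == g)      # two distinct function objects, identical bodies

alias = f
aliased = (alias == f)    # the same object under another name
```

Here same_object and aliased are True while different is False, even
though f and g have textually identical bodies.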

On the other hand if we disassemble f and g we receive a completely
different picture:

>>> import dis
>>> dis.dis(f)
  1           0 LOAD_CONST               0 (None)
              3 RETURN_VALUE

>>> dis.dis(g)
  1           0 LOAD_CONST               0 (None)
              3 RETURN_VALUE

This remains true if we add arguments to f:

def f(x):pass

>>> dis.dis(f)
  1           0 LOAD_CONST               0 (None)
              3 RETURN_VALUE

Inspecting a function using dis.dis() enables us to speak about its
"behavioural identity".
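
As a rough sketch of the same observation without reading the
disassembly by eye, the compiled code objects can be compared directly.
This assumes Python 3, where the code object is reached through
__code__ (func_code, used elsewhere in this post, is the Python 2
spelling). same_body is a hypothetical helper name, and comparing raw
bytecode plus the constant and name tables is only one possible
approximation of behavioural identity:

```python
def f(x): pass
def g(): pass
def h(): return 1

def same_body(a, b):
    """Compare raw bytecode plus the constant and name tables it indexes into."""
    ca, cb = a.__code__, b.__code__
    return (ca.co_code == cb.co_code
            and ca.co_consts == cb.co_consts
            and ca.co_names == cb.co_names)

f_like_g = same_body(f, g)   # signatures differ, bodies do not
f_like_h = same_body(f, h)   # h returns 1 instead of None
```

The argument x of f never shows up in the body, so f and g still
compare as the same, while h does not.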

What is it good for? Answer: for using implementations as interfaces.

Let's consider the following classes:

class NotImplemented(Exception):
    pass

class A(object):
    def __init__(self):
        raise NotImplemented

We can regard class A as a "pure abstract" class. It is impossible to
create instances of A. Each subclass of A that wants to be instantiated
must override __init__. This is clearly a property of A. But a client
object that inspects A by checking the availability of methods and
scanning argument signatures will remain insensitive to this simple
fact. Thinking in Python makes life easier because we can not only
check interfaces superficially but we can inspect the code and we can
compare two code-objects on the behavioural level with a certain
accuracy.

We start with a function

def not_implemented():
    raise NotImplemented

We are not interested in its name or in its (empty) argument
signature, but in the implementation of not_implemented(), which
serves as a prototype for other functions.

A variant of the dis.disassemble() function (see [1]) delivers:

['LOAD_GLOBAL', 'NotImplemented', 'RAISE_VARARGS', 'LOAD_CONST', None,
'RETURN_VALUE']

Analyzing A.__init__ will create exactly the same token stream. A
client object that compares the token streams of __init__ and
not_implemented holds a sufficient criterion for the abstractness of A.
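
A sketch of such a client check, assuming Python 3 and its
dis.get_instructions() instead of the hand-written tracer in [1]. The
names tokens, looks_abstract and the subclass B are hypothetical and
only serve the illustration; the exact opcodes in the stream vary
between interpreter versions, so only the comparison of two streams
produced by the same interpreter is meaningful:

```python
import dis

class NotImplemented(Exception):   # name taken from the post; shadows the built-in constant
    pass

def not_implemented():
    raise NotImplemented

class A(object):
    def __init__(self):
        raise NotImplemented

class B(A):                        # hypothetical concrete subclass
    def __init__(self):
        self.ready = True

def tokens(func):
    """Flatten a function's bytecode into opcode names and resolved arguments."""
    stream = []
    for ins in dis.get_instructions(func):
        stream.append(ins.opname)
        if ins.arg is not None:    # opcodes without an argument report arg=None
            stream.append(ins.argval)
    return stream

def looks_abstract(cls, prototype=not_implemented):
    """True if cls.__init__ has the same token stream as the prototype."""
    return tokens(cls.__init__) == tokens(prototype)
```

With these definitions looks_abstract(A) holds and looks_abstract(B)
does not. The check is deliberately narrow: an __init__ that raises the
same exception in a slightly different way would not be recognised.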



[1] Implementation of a stripped down variant of the dis.disassemble()
function:

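# Python 2 code; the opcode tables come from the standard opcode module.
from opcode import opname, cmp_op, HAVE_ARGUMENT, EXTENDED_ARG
from opcode import hasconst, hasname, hasjrel, haslocal, hascompare, hasfree

def distrace(co):
    "trace a code object"
    code = co.co_code
    n = len(code)
    i = 0
    extended_arg = 0
    free = None
    while i < n:
        c = code[i]
        op = ord(c)
        yield opname[op]
        i = i+1
        if op >= HAVE_ARGUMENT:
            # the argument is stored in the next two bytes, low byte first
            oparg = ord(code[i]) + ord(code[i+1])*256 + extended_arg
            extended_arg = 0
            i = i+2
            if op == EXTENDED_ARG:
                extended_arg = oparg*65536L
            # resolve the raw argument against the matching table
            if op in hasconst:
                yield co.co_consts[oparg]
            elif op in hasname:
                yield co.co_names[oparg]
            elif op in hasjrel:
                yield (i,oparg)
            elif op in haslocal:
                yield co.co_varnames[oparg]
            elif op in hascompare:
                yield cmp_op[oparg]
            elif op in hasfree:
                if free is None:
                    free = co.co_cellvars + co.co_freevars
                yield free[oparg]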


>>> list(distrace(A.__init__.func_code))
['LOAD_GLOBAL', 'NotImplemented', 'RAISE_VARARGS', 'LOAD_CONST', None,
'RETURN_VALUE']


Ciao,
Kay



