__lt__ slowing the "in" operator even if not called

andrewdalke at gmail.com andrewdalke at gmail.com
Thu Jun 15 11:06:32 EDT 2006


Emanuele Aina wrote:
> I have some code which does a lot of "in" on lists containing objects
> with no __eq__ defined.
>
> It all goes fast until I add the __lt__() method: then I have a
> slowdown comparable to the one I get using the overridden __eq__, while
> the __lt__ method is never called.
>
> Someone can explain me why?

The list's __contains__ method is very simple

        for (i = 0, cmp = 0 ; cmp == 0 && i < a->ob_size; ++i)
                cmp = PyObject_RichCompareBool(el, PyList_GET_ITEM(a,
i),
                                                   Py_EQ);

The PyObject_RichCompareBool becomes more complex.  The
relevant code occurs after the check for object identity.  The next
step is

        res = PyObject_RichCompare(v, w, op);

which goes into your case

        /* If the types are equal, and not old-style instances, try to
           get out cheap (don't bother with coercions etc.). */
        if (v->ob_type == w->ob_type && !PyInstance_Check(v)) {
                cmpfunc fcmp;
                richcmpfunc frich = RICHCOMPARE(v->ob_type);
                /* If the type has richcmp, try it first.
try_rich_compare
                   tries it two-sided, which is not needed since we've
a
                   single type only. */
                if (frich != NULL) {
                        res = (*frich)(v, w, op);
                        if (res != Py_NotImplemented)
                                goto Done;
                        Py_DECREF(res);
                }
                /* No richcmp, or this particular richmp not
implemented.
                   Try 3-way cmp. */
                fcmp = v->ob_type->tp_compare;

One new-style class defines '__lt__' while the other does not.

Here's where things get shaky for me.  I think new-style classes
are instances of PyInstanceObject (and not PyClassObject).  Instances
use 'instance_richcompare' to implement the rich comparison
between two instances.  This function does the lookup for '__eq__'
in the two classes.

The tp_richcompare slot is set to instance_richcompare when any
of __lt__, __gt__, __eq_, etc. are defined in a new type.  The macro-
based code looks like

        TPSLOT("__lt__", tp_richcompare, slot_tp_richcompare,
richcmp_lt,
               "x.__lt__(y) <==> x<y"),
        TPSLOT("__le__", tp_richcompare, slot_tp_richcompare,
richcmp_le,
               "x.__le__(y) <==> x<=y"),
        TPSLOT("__eq__", tp_richcompare, slot_tp_richcompare,
richcmp_eq,
               "x.__eq__(y) <==> x==y"),
        TPSLOT("__ne__", tp_richcompare, slot_tp_richcompare,
richcmp_ne,
               "x.__ne__(y) <==> x!=y"),
        TPSLOT("__gt__", tp_richcompare, slot_tp_richcompare,
richcmp_gt,
               "x.__gt__(y) <==> x>y"),
        TPSLOT("__ge__", tp_richcompare, slot_tp_richcompare,
richcmp_ge,
               "x.__ge__(y) <==> x>=y"),

So if you define "__lt__" in your object then the type gets a richcmp
function and your == test implicit in the 'in' search always incurs the
cost of figuring out that "__eq__" is not defined.


                                Andrew
                                dalke at dalkescientific.com




More information about the Python-list mailing list