[Python-Dev] another dict crasher

Tim Peters tim.one@home.com
Sun, 3 Jun 2001 00:04:57 -0400


[Michael Hudson]
> Ah, like this one:
>
> dict = {}
>
> # let's force dict to malloc its table
> for i in range(1,10):
>     dict[i] = i
>
> class Machiavelli2:
>     def __eq__(self, other):
>         dict.clear()
>         return 1
>     def __hash__(self):
>         return 0
>
> dict[Machiavelli2()] = Machiavelli2()
>
> print dict[Machiavelli2()]

Told you it was easy <wink>.

> I'll attach a patch, but it's another branch inside lookdict (though
> not lookdict_string which is I guess the really performance sensitive
> one).

lookdict_string is crucial to Python's own performance.  Dicts indexed by
ints or class instances or ... are vital to other apps.

> Index: dictobject.c
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Objects/dictobject.c,v
> retrieving revision 2.100
> diff -c -1 -r2.100 dictobject.c
> *** dictobject.c        2001/06/02 08:27:39     2.100
> --- dictobject.c        2001/06/02 11:36:47
> ***************
> *** 273,274 ****
> --- 273,281 ----
>                         cmp =
> PyObject_RichCompareBool(ep->me_key, key, Py_EQ);
> +                       if (ep0 != mp->ma_table) {
> +                               PyErr_SetString(PyExc_RuntimeError,
> +                                               "dict resized on
> comparison");
> +                               ep = mp->ma_table;
> +                               while (ep->me_value) ep++;
> +                               return ep;
> +                       }
>                         if (cmp > 0) {
> ***************
> *** 310,311 ****
> --- 317,325 ----
>                         cmp =
> PyObject_RichCompareBool(ep->me_key, key, Py_EQ);
> +                       if (ep0 != mp->ma_table) {
> +                               PyErr_SetString(PyExc_RuntimeError,
> +                                               "dict resized on
> comparison");
> +                               ep = mp->ma_table;
> +                               while (ep->me_value) ep++;
> +                               return ep;
> +                               }
>                         if (cmp > 0) {

Then we have other problems.  Note the comment before lookdict:

    Exceptions are never reported by this function, and outstanding
    exceptions are maintained.

The patched code doesn't preserve that.

Looking for "the first" unused or dummy slot isn't good enough either, as
surely the user has the right to expect that after, e.g., d[m] = 1, d[m]
retrieves 1.  That is, picking a reusable slot "at random" doesn't respect
the *semantics* of dict operations ("just because" the dict resized doesn't
mean the key they're looking for went away!).  It would be better in this
case to go back to the top and start over.  However, then an adversarial
user can construct a case that never terminates.  Unclear what to do.