Swapping Content of Two Dictionaries.

Hatem Oraby hatem.oraby at gmail.com
Wed Mar 17 07:26:33 EDT 2010


Hello, I want to swap the content of two dictionaries, the obvious way to do
it is:
a = {1:"I'am A"}
b = {2:"I'm B"}
temp = a
a = b
b = temp

However, consider the case in which the dictionary we are referencing lives
in another module:
#external.py
#
#a = {1:"I'am A}
import external

temp = external.a
external.a = b
b = temp

Well, still this would work great and anyone later that would refereence
external.a would find the content of b, but consider the case where someone
else imported "external.py" before my module run, for example:
#external.py
#
#a = {1:"I'm A}

#annoyer.py
#from external import a
#
# #Do other stuff with a
import external

temp = external.a
external.a = b
b = temp

Now in such case when I overwrite the content of external.a in my module,
the change in dictionary is not reflected in annoyer module, that's because
annoyer hold a reference to the original a but not the new one. annoyer
module still see the old content of a.

The workaround that i did to solve this problem is the following (I'm
simplifying it right here):

tempKeys = a.keys()
diffKeys = a.keys()  - b.keys()   #I do it using set()s

#item = a
temp = {}
for key in a.keys():
   item[key] = a[key]

for key in diffKeys:
    del a[key] #delete stuff that exist in a but not b.

for key in b.keys():
    a[key] = b[key]

b = temp

This works great as the content referenced by the dictionary is changed
rather than changing the reference of dictionary itself so it's reflected by
any "scope" that references the dictionary.

My problem is that i need to do this operation a LOT, simply I got a problem
that my program go through a very long loop and inside this loop this
operation is needed to be done twice and the dictionary size ain't very
small.

If anyone has a suggestion one how to do it faster then please feel free and
welcome to contribute.

Anyway, I decided to go and implement it in C to gain performance, And since
I got no expereince in writing C extension I'm not really sure if what I'm
doing right or wrong, long story short here is what i decided to do:
In real python (I mean not the C code) an object is mainly a reference to a
C PyObject (or an inheritance of the PyObject).
Python dictionaries are represented as PyDictObject (PyDictObject inherits
PyObject), so every Python Dictionary is a reference to PyDictObject,

So what i came up with to swap two Python dictionaries is to swap the
content of PyDictObjects of the dictionaries, I know that "illegal" way of
doing things in Python but I need the hack to gain performance.

I opened up python source files:
object.h (extracted the definition of PyObject)
dictObject.g (extctracted the definition of PyDictObject)

And I've written the following C extension:


   1. #include <Python.h>
   2. #include <dictobject.h> //I know that it was already imported
   3.
   4. static char swap_doc[] =
   5. "My try to swap dicts, isa hopefully it will work good.";
   6.
   7.
   8. static PyObject*
   9. swap_swapDict(PyObject *self, PyObject *args)
   10. {
   11.         PyObject *a, *b;
   12.         PyDictObject *x, *y;
   13.         //Temp PyDictObject attriubutes
   14.         Py_ssize_t ob_refcnt;
   15.         PyTypeObject *ob_type;
   16.         Py_ssize_t ma_fill;
   17.         Py_ssize_t ma_used;
   18.         Py_ssize_t ma_mask;
   19.         PyDictEntry *ma_table;
   20.         PyDictEntry *(*ma_lookup)(PyDictObject *mp, PyObject *key,
   long hash);
   21.         PyDictEntry *ma_smalltable;//[PyDict_MINSIZE];
   22.
   23.
   24.         if (!PyArg_UnpackTuple(args, "swapDict", 2, 2, &a, &b)) {
   25.                 return NULL;
   26.         }
   27.
   28.         //Make sure that they are dictionaries
   29.         if (!PyDict_Check(a) && !PyDict_Check(b)) {
   30.                 PyErr_SetString(PyExc_TypeError,
   31.                                 "dictionary required, something else
   was passed");
   32.                 return NULL;
   33.         }
   34.
   35.         x = (PyDictObject*)a;
   36.         y = (PyDictObject*)b;
   37.
   38.         ob_refcnt = x->ob_refcnt;
   39.         x->ob_refcnt = y->ob_refcnt;
   40.         y->ob_refcnt = ob_refcnt;
   41.
   42.         ob_type = x->ob_type;
   43.         x->ob_type = y->ob_type;
   44.         y->ob_type = ob_type;
   45.
   46.         ma_fill = x->ma_fill;
   47.         x->ma_fill = y->ma_fill;
   48.         y->ma_fill = ma_fill;
   49.
   50.         ma_used = x->ma_used;
   51.         x->ma_used = y->ma_used;
   52.         y->ma_used = ma_used;
   53.
   54.         ma_mask = x->ma_mask;
   55.         x->ma_mask = y->ma_mask;
   56.         y->ma_mask = ma_mask;
   57.
   58.         ma_table = x->ma_table;
   59.         x->ma_table = y->ma_table;
   60.         y->ma_table = ma_table;
   61.
   62.         ma_lookup = x-> ma_lookup;
   63.         x->ma_lookup = y-> ma_lookup;
   64.         y->ma_lookup = ma_lookup;
   65.
   66.         ma_smalltable = x->ma_smalltable;
   67.         *(x->ma_smalltable) = *(y->ma_smalltable);
   68.         *(y->ma_smalltable) = *(ma_smalltable);
   69.
   70.         return Py_None;
   71. }
   72.
   73. //This is extracted form dictObject.h and object.h
   74. //typedef struct _dictobject PyDictObject;
   75. //struct _dictobject {
   76. //      PyObject_HEAD
   77. //      Py_ssize_t ma_fill;  /* # Active + # Dummy */
   78. //      Py_ssize_t ma_used;  /* # Active */
   79. //
   80. //      /* The table contains ma_mask + 1 slots, and that's a power
   of 2.
   81. //       * We store the mask instead of the size because the mask is
   more
   82. //       * frequently needed.
   83. //       */
   84. //      Py_ssize_t ma_mask;
   85. //
   86. //      /* ma_table points to ma_smalltable for small tables, else to
   87. //       * additional malloc'ed memory.  ma_table is never NULL!
   This rule
   88. //       * saves repeated runtime null-tests in the workhorse getitem
   and
   89. //       * setitem calls.
   90. //       */
   91. //      PyDictEntry *ma_table;
   92. //      PyDictEntry *(*ma_lookup)(PyDictObject *mp, PyObject *key,
   long hash);
   93. //      PyDictEntry ma_smalltable[PyDict_MINSIZE];
   94. //};
   95. //PyObject_Head is a simple two statements equal to:
   96. //Py_ssize_t ob_refcnt;
   97. //struct _typeobject *ob_type;
   98.
   99. static char swap_swapDict_doc[] =
   100. "Replaces two dicts with each other.";
   101.
   102. static PyMethodDef swap_methods[] = {
   103.         {"swapDict", swap_swapDict, METH_VARARGS, swap_swapDict_doc}
   ,
   104.         {NULL, NULL}
   105. };
   106.
   107. PyMODINIT_FUNC
   108. initswap(void)
   109. {
   110.         Py_InitModule3("swap", swap_methods, swap_doc);
   111. }



What the code mainly do is that it swap the content of the PyDict objects.
However I got a bug where one of them swaps correctly while the other points
to a displaced memory location.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-list/attachments/20100317/bfaf769f/attachment.html>


More information about the Python-list mailing list