how are dictionary literals handled by the interpreter?

akameswaran at gmail.com akameswaran at gmail.com
Wed Sep 13 17:32:10 EDT 2006


Jason wrote:
> akameswaran at gmail.com wrote:
> > I wrote up a quick little set of tests, I was acutally comparing ways
> > of doing "case" behavior just to get some performance information.  Now
> > two of my test cases had almost identical results which was not at all
> > what I expected.  Ultimately I realized I don't really know how
> > literals are treated within the interpreter.
> >

the lengthy dictionaries were due to my initial testing comparing
if/ifelse constructs to dictionary cases.. btw if/ifelse seems a lot
faster till you get to at least 12 cases, and for most applications,
the number of cases would be even higher before dictionaries beat a
bunch of ifs

> > The two implementations I was looking at were:
> >
> > class caseFunction(object):
> >     def __init__(self):
> >         self.caseDict = {'a':"retval = 'a'",
> > 'b':"retval='b'","c":"retval='c'","d":"retval='d'",
> >
> > "e":"retval='e'","f":"retval='f'","g":"retval='g'","h":"retval='h'",
> >                          "i":"retval='i'"}
> >
> >     def doIt(self,a):
> >         exec(self.caseDict.get(a))
> >         return retval
> >
> >
> >
> > def caseFunc3(a):
> >     exec(  {'a':"retval = 'a'",
> > 'b':"retval='b'","c":"retval='c'","d":"retval='d'",
> >
> > "e":"retval='e'","f":"retval='f'","g":"retval='g'","h":"retval='h'",
> >                          "i":"retval='i'"}.get(a))
> >     return retval
> >
> >
> > I had expected caseFunc3 to be slower.  I had thought the interpreter
> > would be recreating the dictionary each time, but that doesn't seem to
> > be the case since performance of the class version and the function
> > version are nearly identical on most runs.  If i rewrite caseFunc3 as:
> >
> > def caseFunc4(a):
> >     exec(  dict({'a':"retval = 'a'",
> > 'b':"retval='b'","c":"retval='c'","d":"retval='d'",
> >
> > "e":"retval='e'","f":"retval='f'","g":"retval='g'","h":"retval='h'",
> >                          "i":"retval='i'"}).get(a))
> >     return retval
> >
> > now with the explicit use of dict, i see the performace of the
> > functional version decline as I initially expected.
> >
> > So what is happeneing in caseFunc3.  It seems as though the literal is
> > "cached".  The other hypothesis I came up with is the name lookup for
> > self.caseDict takes the same amount of time as creating the dictionary
> > literal - but that doesn't make sense to me.
> >
> > Thanks
>
> Why not check to see what the interpreter is doing?  Rather than
> dealing with your overly complicated dictionaries, I've made a simple,
> one case dictionary.  I've also done a similar bit to replicate your
> doIt method.
>
Thanks, i didn't know about dis.  So this is very useful.

> >>> def case3(a):
> ...     exec( {'a': "retval = 'a'"}.get(a) )
> ...     return retval
> ...
> >>> case3('a')
> 'a'
> >>> def case4(a):
> ...     exec( dict({'a': "retval = 'a'"}).get(a) )
> ...     return retval
> ...
> >>> case4('a')
> 'a'
> >>> class caseFunction(object):
> ...     def doIt(self, a):
> ...         exec(self.caseDict.get(a))
> ...         return retval
> ...
>
> Then, use the dis module to disassemble the function objects:
> >>> import dis
> >>> dis.dis(case3)
>   2           0 BUILD_MAP                0
>               3 DUP_TOP
>               4 LOAD_CONST               1 ('a')
>               7 LOAD_CONST               2 ("retval = 'a'")
>              10 ROT_THREE
>              11 STORE_SUBSCR
>              12 LOAD_ATTR                0 (get)
>              15 LOAD_FAST                0 (a)
>              18 CALL_FUNCTION            1
>              21 LOAD_CONST               0 (None)
>              24 DUP_TOP
>              25 EXEC_STMT
>
>   3          26 LOAD_NAME                2 (retval)
>              29 RETURN_VALUE
> >>> dis.dis(case4)
>   2           0 LOAD_NAME                0 (dict)
>               3 BUILD_MAP                0
>               6 DUP_TOP
>               7 LOAD_CONST               1 ('a')
>              10 LOAD_CONST               2 ("retval = 'a'")
>              13 ROT_THREE
>              14 STORE_SUBSCR
>              15 CALL_FUNCTION            1
>              18 LOAD_ATTR                1 (get)
>              21 LOAD_FAST                0 (a)
>              24 CALL_FUNCTION            1
>              27 LOAD_CONST               0 (None)
>              30 DUP_TOP
>              31 EXEC_STMT
>
>   3          32 LOAD_NAME                3 (retval)
>              35 RETURN_VALUE
> >>> dis.dis(caseFunction.doIt)
>   3           0 LOAD_FAST                0 (self)
>               3 LOAD_ATTR                1 (caseDict)
>               6 LOAD_ATTR                2 (get)
>               9 LOAD_FAST                1 (a)
>              12 CALL_FUNCTION            1
>              15 LOAD_CONST               0 (None)
>              18 DUP_TOP
>              19 EXEC_STMT
>
>   4          20 LOAD_NAME                4 (retval)
>              23 RETURN_VALUE
> >>>
>
> Take a look at what happens before the 'get' attribute is loaded in
> each case.  In case 3, you've simply created a dictionary literal,
> which is a very fast operation under Python.  In case 4, you've created
> a dictionary literal, then you call the dict() function.  The dict
> function will create a dictionary from the supplied dictionary, and
> return the shallow copy.
>
I knew case4 would be slower, it did what it intended, but didn't
answer my question.  How's that for confused thinking!!

> Case 3 is slower, but the Python developers have worked to make
> dictionary creation and look-up very fast.  Did you use the timeit
> module to test your functions?
>
I had a simple timer, not timeit.

def timeIt(func,vals):
    start = time.time()
    for item in vals:
        func(item)
    return time.time()-start

#reinvented the wheel too, not a good day to program

The output provided was a good starting point, and unless I am really
misreading things, python is not caching my dictionary literal.  It
would appear as though it gets loaded every iteration - which is what I
expected.

So here's the odd part - and one I still find remarkably
counter-intuitive.  and i'll set up some of the params here

1) I did not count object instantiation in the time taken - so in the
class based version the dictionary was already created before I start
timing

2) Once the dictionary is either created (function) or retreived(class)
the execution is identical.  That can be seen from the output above you
provided.

which lead me to hypothesize:
3) loading a dictionary literal seems faster than accessing that
dictionary as a class member.

I really find 3 to be surprising although I guess I shouldn't - lookups
are not cheap... I repeat that to myself all the time.  apparently they
really really aren't cheap.

Now that begs the question... and I'm writing something up to test it.
Will a larger dictionary change this dynamic.  My suspicion yes, how
much bigger I have no idea.


and mostly thanks for the intro to a new module.  With all the esoteric
junk I play with and experiment with I can't believe I didn't stumble
on it before.




More information about the Python-list mailing list