default behavior

Fri Jul 30 23:47:49 EDT 2010

On Fri, 30 Jul 2010 08:34:52 -0400, wheres pythonmonks wrote:

> Sorry, doesn't the following make a copy?
> 
> >>> from collections import defaultdict as dd
> >>> x = dd(int)
> >>> x[1] = 'a'
> >>> x
> defaultdict(<type 'int'>, {1: 'a'})
> >>> dict(x)
> {1: 'a'}
> 
> I was hoping not to do that -- e.g., actually reuse the same underlying
> data.  

It does re-use the same underlying data.

>>> from collections import defaultdict as dd
>>> x = dd(list)
>>> x[1].append(1)
>>> x
defaultdict(<type 'list'>, {1: [1]})
>>> y = dict(x)
>>> x[1].append(42)
>>> y
{1: [1, 42]}

Both the defaultdict and the dict refer to the same underlying keys and
values. The data itself isn't duplicated. If the values are mutable, a
change made through one dict is visible through the other (because they
are the same objects). An analogy for C programmers would be that creating
dict y from defaultdict x merely copies the pointers to the keys and
values; it doesn't copy the data being pointed to.

(That's pretty much what the CPython implementation does. Other 
implementations may do differently, so long as the visible behaviour 
remains the same.)
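
A minimal sketch of what "the same underlying data" does and doesn't mean,
which should run on Python 2 or 3. The names y, z and the three result
variables are only for illustration. dict(x) builds a new container (a
shallow copy) whose values are the same objects as in x; copy.deepcopy is
what you would reach for if you actually wanted independent data.

```python
import copy
from collections import defaultdict

x = defaultdict(list)
x[1].append(1)

y = dict(x)            # new container, same value objects (shallow copy)
z = copy.deepcopy(x)   # new container AND new value objects

x[1].append(42)        # mutate the list that x and y share
x[2].append(7)         # add a brand-new key to x only

shares_value = y[1] is x[1]   # the list object itself is shared
y_has_new_key = 2 in y        # keys added to x afterwards don't show up in y
deep_value = z[1]             # the deep copy never saw the 42
```

So the containers are separate, but the items inside them are not.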

> Maybe dict(x), where x is a defaultdict is smart?  I agree that a
> defaultdict is safe to pass to most routines, but I guess I could
> imagine that a try/except block is used in a bit of code where on the
> key exception (when the value is absent)  populates the value with a
> random number.  In that application, a defaultdict would have no random
> values.

If you want a defaultdict with a random default value, it is easy to 
provide:

>>> import random
>>> z = dd(random.random)
>>> z[2] += 0
>>> z
defaultdict(<built-in method random of Random object at 0xa01e4ac>, {2: 0.30707092626033605})

The point which I tried to make, but obviously failed, is that any piece
of code has certain expectations about the data it accepts. If you take a
function that expects an int between -2 and 99, and instead decide to
pass a Decimal between 100 and 150, then you'll have problems: if you're
lucky, you'll get an exception; if you're unlucky, it will silently give
the wrong results. Changing a dict to a defaultdict is no different.

If you have code that *relies* on getting a KeyError for missing keys:

def who_is_missing(adict):
    for person in ("Fred", "Barney", "Wilma", "Betty"):
        try:
            adict[person]
        except KeyError:
            print person, "is missing"

then changing adict to a defaultdict will cause the function to 
misbehave. That's not unique to dicts and defaultdicts.
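
To see the misbehaviour concretely, here is a sketch that should run on
Python 2 or 3. It is a variant of the function above that returns the
names instead of printing them (my change, so the result is easy to
inspect), next to a hypothetical who_is_missing_in that uses a membership
test. The PEOPLE constant and the sample data are made up for the example.

```python
from collections import defaultdict

PEOPLE = ("Fred", "Barney", "Wilma", "Betty")

def who_is_missing(adict):
    """Relies on KeyError, like the original, but returns a list."""
    missing = []
    for person in PEOPLE:
        try:
            adict[person]
        except KeyError:
            missing.append(person)
    return missing

def who_is_missing_in(adict):
    """Uses `in`, which never calls the default factory."""
    return [person for person in PEOPLE if person not in adict]

plain = {"Fred": 1}
counts = defaultdict(int, plain)

from_plain = who_is_missing(plain)       # the three absent names
checked = who_is_missing_in(counts)      # same three names; counts untouched
from_dd = who_is_missing(counts)         # empty: each lookup created the key
keys_after = sorted(counts)              # all four names are now present
```

Note that the KeyError version doesn't just give the wrong answer with a
defaultdict, it also fills the dict with default entries as a side effect.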

> Besides a slightly different flavor, does the following have applications
> not covered by defaultdict?
> 
> m.setdefault('key', []).append(1)

defaultdict calls a function of no arguments to provide a default value. 
That means, in practice, it almost always uses the same default value for 
any specific dict.

setdefault takes the default value as an argument each time you call it.
So you can provide anything you like at runtime.
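
A small sketch of the difference, with made-up keys and values: the
defaultdict's factory is fixed when the dict is created, while each
setdefault call can supply a different default.

```python
from collections import defaultdict

# defaultdict: one zero-argument factory, chosen up front
d = defaultdict(list)
d["a"].append(1)          # missing key, so list() is called to make the value

# setdefault: the default is an argument, so it can differ per call
m = {}
m.setdefault("nums", []).append(1)            # key missing: [] is stored
m.setdefault("name", "unknown")               # a different kind of default
m.setdefault("nums", ["ignored"]).append(2)   # key exists: default is unused
```

One thing to keep in mind is that the default passed to setdefault is
evaluated on every call, whether or not it ends up being used.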

> I think I am unclear on the difference between that and:
> 
> m['key'] = m.get('key',[]).append(1)

Have you tried it? I guess you haven't, or you wouldn't have thought they 
did the same thing.

Hint -- what does [].append(1) return?
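
For anyone who would rather check than guess, a short sketch (the names
result, m and n are just for illustration):

```python
result = [].append(1)     # append mutates the list in place and returns None

m = {}
m['key'] = m.get('key', []).append(1)   # stores append's return value

n = {}
n.setdefault('key', []).append(1)       # stores the list, then appends to it
```

After this, m['key'] is None while n['key'] is [1].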

-- 
Steven