Pre-PEP: Dictionary accumulator methods
Beni Cherniavsky
cben at users.sf.net
Sat Mar 19 22:05:14 EST 2005
Alexander Schmolck wrote:
> "Raymond Hettinger" <vze4rx4y at verizon.net> writes:
>
>>The rationale is to replace the awkward and slow existing idioms for dictionary
>>based accumulation:
>>
>> d[key] = d.get(key, 0) + qty
>> d.setdefault(key, []).extend(values)
>>
Indeed, neither is very readable. The try..except version is better but too
verbose. Underneath there is a simple concept, assuming a default value, and
we need "one obvious" way to write it.
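For comparison, the try..except version I have in mind looks roughly like this (the ``words`` list is just sample data I made up for illustration):

```python
# Counting with try..except: correct, but four lines of ceremony
# for what is conceptually "add one, starting from zero".
words = ["spam", "egg", "spam"]  # sample data, not from the proposal

d = {}
for word in words:
    try:
        d[word] += 1
    except KeyError:
        d[word] = 1
```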
>>In simplest form, those two statements would now be coded more readably as:
>>
>> d.count(key)
>> d.appendlist(key, value)
>
> Yuck.
>
-1 from me too on these two methods because they only add "duct tape" for the
problem instead of solving it. We need to improve upon `dict.setdefault()`,
not specialize it.
> The relatively recent "improvement" of the dict constructor signature
> (``dict(foo=bar,...)``) obviously makes it impossible to just extend the
> constructor to ``dict(default=...)`` (or anything else for that matter) which
> would seem much less ad hoc. But why not use a classmethod (e.g.
> ``d=dict.withdefault(0)``) then?
>
You mean giving a dictionary a default value at creation time, right?
Such a dictionary could be used very easily, as in <gasp>Perl::

    foreach $word ( @words ) {
        $d{$word}++;  # default of 0 assumed, simple code!
    }

</gasp>. You would like to write::

    d = dict.withdefault(0)  # or something
    for word in words:
        d[word] += 1  # again, simple code!

I agree that it's a good idea but I'm not sure the default should be specified
at creation time. The problem with that is that if you pass such a dictionary
into an unsuspecting function, it will not behave like a normal dictionary.
Also, this will go awry if the default is a mutable object, like ``[]`` - you
must create a new one at every access (or introduce a rule that the object is
copied every time, which I dislike). And there are cases where different
points in the code, operating on the same dictionary, need different default
values.
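To make the mutable-default problem concrete, here is a rough sketch. ``NaiveDefaultDict`` is a name I made up; it only imitates what a dictionary with a default fixed at creation time might do if it stored the one default object it was given:

```python
class NaiveDefaultDict(dict):
    """Hypothetical dict whose default is fixed at creation time."""

    def __init__(self, default):
        dict.__init__(self)
        self.default = default

    def __getitem__(self, key):
        # Stores the *same* default object under every missing key.
        return self.setdefault(key, self.default)


d = NaiveDefaultDict([])
d["a"].append(1)
d["b"].append(2)

# Both keys now refer to one shared list, which is not what was meant.
shared = d["a"] is d["b"]
```

Here ``shared`` ends up true and ``d["a"]`` contains both 1 and 2, which is exactly the surprise I want to avoid.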
So perhaps specifying the default at every point of use by creating a proxy is
cleaner::

    d = {}
    for word in words:
        d.withdefault(0)[word] += 1

Of course, you can always create the proxy once and still pass it into an
unsuspecting function when that is actually what you mean.
How should a dictionary with a default value behave (whether inherently or
through a proxy)?

- ``d.__getitem__(key)`` never raises KeyError for missing keys; instead it
  returns the default value and stores it, as `d.setdefault()` does.
  This is needed to make code like::

      d.withdefault([])[key].append(foo)

  work: there is no call to `d.__setitem__()`, so `d.__getitem__()` must
  have stored the value.
- `d.__setitem__()` and `d.__delitem__()` behave normally.
- Should ``key in d`` return True for all keys? It is desirable to have
  *some* way to know whether a key is really present. But if it returns False
  for missing keys, code that checks ``key in d`` will behave differently from
  normally equivalent code that uses try..except. If we use the proxy
  interface, we can always check on the original dictionary object, which
  removes the problem.
- ``d.has_key(key)`` must do whatever we decide ``key in d`` does.
- What should ``d.get(key, [default])`` and ``d.setdefault(key, default)``
  do? There is a conflict between the default of `d` and the explicitly given
  default. I think consistency is better and these should pretend that `key`
  is always present. But either way, there is a subtle problem here.
- Of course `iter(d)`, `d.items()` and the like should only see the keys
  that are really present (the alternative, inventing an infinite number of
  items out of the blue, is clearly bogus).
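The ``key in d`` dilemma can be shown with a small sketch. ``DefaultingDict`` and the two lookup functions are names I invented for illustration; the class assumes the variant where ``key in d`` returns False for keys that are not really present:

```python
class DefaultingDict(dict):
    """Hypothetical dictionary with a built-in default value."""

    def __init__(self, default):
        dict.__init__(self)
        self.default = default

    def __getitem__(self, key):
        # Never raises KeyError: stores and returns the default instead.
        return self.setdefault(key, self.default)


def lookup_checked(d, key):
    # Relies on the membership test.
    if key in d:
        return d[key]
    return None


def lookup_tried(d, key):
    # Normally equivalent code that relies on KeyError.
    try:
        return d[key]
    except KeyError:
        return None


d = DefaultingDict(0)
checked = lookup_checked(d, "x")  # membership test fails, nothing is stored
tried = lookup_tried(d, "y")      # KeyError never happens, "y" gets stored
```

With a normal dictionary both functions return None for a missing key. Here ``checked`` is None while ``tried`` is 0, and only "y" ends up stored in ``d``.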
If the idea that the default should be specified at every operation (by
creating a proxy) is accepted, there is a simpler and more foolproof solution:
the proxy will not support anything except `__getitem__()` and `__setitem__()`
at all. Use the original dictionary for everything else. This prevents subtle
ambiguities.
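A minimal sketch of such a restricted proxy might look like the following. The class name ``DefaultProxy`` and the free function ``withdefault(d, default)`` are my own stand-ins for the proposed ``d.withdefault(default)`` method, since the built-in dict has no such method. I also block ``__iter__`` and ``__contains__`` explicitly, because Python would otherwise try to derive iteration and membership tests from ``__getitem__``:

```python
class DefaultProxy(object):
    """Hypothetical proxy: item access on `target` with a default."""

    def __init__(self, target, default):
        self.target = target
        self.default = default

    def __getitem__(self, key):
        # Store the default on first access, as dict.setdefault() does,
        # so that proxy[key].append(x) changes a value kept in the dict.
        return self.target.setdefault(key, self.default)

    def __setitem__(self, key, value):
        self.target[key] = value

    # Everything else is deliberately unsupported.
    def __iter__(self):
        raise TypeError("iterate over the underlying dictionary instead")

    def __contains__(self, key):
        raise TypeError("test membership on the underlying dictionary instead")


def withdefault(d, default):
    """Stand-in for the proposed d.withdefault(default) method."""
    return DefaultProxy(d, default)


counts = {}
for word in ["spam", "egg", "spam"]:
    withdefault(counts, 0)[word] += 1

groups = {}
for key, value in [("a", 1), ("b", 2), ("a", 3)]:
    # A fresh [] is created at each point of use, so nothing is shared.
    withdefault(groups, [])[key].append(value)
```

Since ``counts`` and ``groups`` stay ordinary dictionaries, ``key in counts`` and iteration keep their usual meaning, and an unsuspecting function that receives them sees nothing unusual.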
> Or, for the first and most common case, just a bag type?
>
Too specialized IMHO. You want a dictionary with any default anyway. If you
have that, what will be the benefit of a bag type?