[Python-3000] PEP 3106: Revamping dict.keys(), .values() and .items()

Wed Dec 20 12:22:29 CET 2006

Guido van Rossum wrote:
> I've written a quick version of PEP 3106, which expresses my ideas
> about how the dict methods to access keys, values and items should be
> redone.
> 
> The text is in svn:
> http://svn.python.org/view/peps/trunk/pep-3106.txt?rev=53096&view=markup
> 
> At some point it will appear on python.org: http://python.org/dev/peps/pep-3106/
> 
> Comments please? (Or we can skip the comments and go straight to the
> implementation stage. Patch anyone?)

Looks pretty good. Some specific comments:

1. There's no reason to implement __contains__ on d.values() - the pseudocode 
currently in the PEP is what the interpreter already falls back to doing if 
__iter__ is provided but __contains__ isn't.
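
To illustrate point 1, here is a minimal sketch of the fallback. ValuesView is a hypothetical stand-in for d.values(), not the PEP's actual class; it defines __iter__ only, and the 'in' operator still works because the interpreter iterates and compares each element.

```python
class ValuesView:
    """Hypothetical stand-in for d.values(): defines __iter__ only."""

    def __init__(self, mapping):
        self._mapping = mapping

    def __iter__(self):
        # Yield each value of the underlying mapping in turn.
        for key in self._mapping:
            yield self._mapping[key]


d = {"a": 1, "b": 2}
view = ValuesView(d)

# No __contains__ is defined, so 'in' falls back to iterating the view.
found = 2 in view
missing = 3 in view
```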

2. At least initially, I'd prefer to make the immutability of d.values()
total: no pop(), no clear(). I think this rule (d.values() doesn't let you
modify the underlying dict at all) is easier to remember than one where
pop() & clear() are permitted, but every other mutating operation that
d.keys() and d.items() permit is disallowed.

3. The definition of d.items().copy() as set(self) may have a problem: the 
values in the original dict may not be hashable, so set(d.items()) may fail 
with a TypeError. If we want d.items().copy() and d.values().copy() to work 
without potentially raising TypeError we would need two new data types 
(probably in the collections module):
   a. A keyed set: something that behaves like a set, but accepts a callable 
that describes how to retrieve the key value from the items in the set. Then 
'd.items().copy()' would be equivalent to 'collections.keyedset(d.items(), 
key=operator.itemgetter(0))'.
   b. A non-hashing multiset: dict values aren't guaranteed to be hashable, so 
a defaultdict-based bag implementation wouldn't help.
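
A rough sketch of the problem in point 3 and of option (a). KeyedSet is a hypothetical class written only for illustration (no such type exists in collections), and its method set is an assumption: it stores items in a dict indexed by key(item), so only the extracted key has to be hashable.

```python
import operator

d = {"a": [1, 2], "b": [3]}  # list values are not hashable

# set(d.items()) must hash each (key, value) tuple, which fails here.
try:
    set(d.items())
    items_copy_failed = False
except TypeError:
    items_copy_failed = True


class KeyedSet:
    """Hypothetical keyed set: uniqueness is decided by key(item)."""

    def __init__(self, iterable=(), key=lambda item: item):
        self._key = key
        self._items = {}  # maps key(item) -> item
        for item in iterable:
            self.add(item)

    def add(self, item):
        self._items[self._key(item)] = item

    def __contains__(self, item):
        k = self._key(item)
        return k in self._items and self._items[k] == item

    def __iter__(self):
        return iter(self._items.values())

    def __len__(self):
        return len(self._items)


# Only the dict key (element 0 of each pair) is hashed.
items_copy = KeyedSet(d.items(), key=operator.itemgetter(0))
```

Option (b) would need a different trick (for example, a list searched linearly by equality), since there is no hashable key to fall back on for bare values.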

4. Given the previous point, perhaps we should just drop the idea of providing 
a copy() method at all on any of d.keys(), d.items() or d.values()?
If the dict's values are hashable (and we include a multiset implementation in 
collections), then 'set(d.keys())', 'set(d.items())' and 
'multiset(d.values())' will be just as clear as a copy() method, and the 
TypeError you get if the values in the original dictionary aren't hashable 
will likely be less surprising. This approach would also parallel the
copying idiom that tolerates non-hashable values in the original dict:
'set(d.keys())', 'list(d.items())' and 'list(d.values())'.
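
The two copying idioms from point 4, side by side. There is no multiset type under that name, so collections.Counter (added to the standard library after this message was written) is used here as an assumed stand-in for 'multiset'; like set, it requires hashable elements.

```python
from collections import Counter

# Case 1: values are hashable, so set/multiset-style copies work.
hashable = {"a": 1, "b": 1, "c": 2}
keys_copy = set(hashable.keys())
items_copy = set(hashable.items())
values_copy = Counter(hashable.values())  # stand-in for multiset(d.values())

# Case 2: values are not hashable, so fall back to lists for items/values.
unhashable = {"a": [1], "b": [1]}
keys_copy2 = set(unhashable.keys())
items_copy2 = list(unhashable.items())
values_copy2 = list(unhashable.values())

# The hashing-based copy fails loudly on unhashable values.
try:
    Counter(unhashable.values())
    multiset_failed = False
except TypeError:
    multiset_failed = True
```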

I'm personally somewhat ambivalent regarding points 3 & 4 - I like the idea of 
having a corresponding concrete container class with similar semantics for 
each of the three alternate dictionary views, and once we have those then 
there's little reason not to provide the copy methods. OTOH, the growing 
prevalence of copy methods on standard library container types would then 
start to make me wonder whether or not copy.copy() (or a less forgiving 
equivalent designed specifically to copy containers) should simply be made a 
builtin, such that the one obvious way to do a shallow copy of any container
would be 'copy(x)'. Option 4 ducks that particular issue by making it easy to
avoid adding the two new container types.
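
For reference, a small sketch of what the existing copy.copy() already does for a concrete container. It is imported from the copy module here; the builtin 'copy(x)' spelling above is only a suggestion and is not assumed to exist.

```python
import copy

original = {"k": [1, 2]}
shallow = copy.copy(original)

# The outer container is new, so adding a key does not affect the original.
shallow["new"] = 0

# The contained list is shared, so mutating it is visible through both.
shallow["k"].append(3)
```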

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org