lambda

Wed Jan 19 08:14:03 EST 2005

Antoon Pardon wrote:
	> I can be wrong, but until now I have seen no indication that I was
> using mutable and immutable differently than other people. AFAICT
> we all refer to whether an object belongs to a mutable or immutable
> class.

The difference is that when you take a copy of the key and stick it in the 
dictionary, such that the dictionary now holds the *sole* reference to that key, 
you have made that key *effectively* immutable.

This is why no-one really batted an eyelid when you mentioned that mutable keys 
can be used safely by making a copy - doing so makes the key *effectively* 
immutable, and means that modifying the original key object (i.e. the 
application's copy, not the dict's copy), and reusing it to look up in the 
dictionary is likely to give a KeyError.

These semantics would be understandably surprising to many users, and hence, are 
not supplied by default.

Additionally, a dictionary's keys are accessible via its API. Accordingly, to 
preserve this 'effective immutability', making a copy on key input is 
insufficient - keys must be copied on *output* as well (that is, dict.keys, 
dict.items etc must return *copies* of the key objects, not the key objects 
themselves).

Since there is no reliable way in Python to tell if an object is mutable or not 
(the closest equivalent is the presence of __hash__, which clearly can't be used 
in this example), this copying would need to be done for *every* object.

Alternately, the dictionary can say to the API clients, "make an immutable copy 
and supply that instead. It is left to API clients to decide how best to make 
the immutable version". The API client copies the key once (to make the 
immutable version), and everyone lives happily ever after.

For example:

Py> class mylist(list):
...   def __init__(self, arg):
...     super(mylist, self).__init__(arg)
...     self._tuple = None
...   def frozen(self):
...     if self._tuple is None:
...       self._tuple = tuple(self)
...     return self._tuple
...   def unfreeze(self):
...     self._tuple = None
...
Py> x = mylist(range(10))
Py> x
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Py> x.frozen()
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
Py> x.append(10)
Py> x.frozen()
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
Py> x.unfreeze()
Py> x.frozen()
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

This is much safer than having a list subclass that implements __hash__, and 
still lets you avoid redundant copy operations.

Nicely, so long as you don't unfreeze the object you used to key the dictionary, 
  reusing the same object will always get you the correct dictionary entry, and 
two lists that compare equal at the time they are frozen will also get you the 
same dictionary entry. The equivalent tuples can be used for the lookup, too.

> I also don't want my values to change when I have sorted a list
> and still need to apply a number of algorithms that rely on
> that. Nobody seems to have problems with the possibility that
> the list items are mutable (and called such).

OK, to make this comparison of sorted lists and dictionaries fair:

Write a sorted_list class that is like a regular Python list, but maintains as 
an invariant that the list contents will stay sorted.

See how well you go maintaining that invariant while allowing mutable objects in 
the list. The only way would be to copy them in when they're supplied, and copy 
them out again when you're done. Otherwise, there is no way the class can keep 
its promise. The performance will be lousy, since __setitem__ and __getitem__ 
will be making copies all the time.

Alternatively, the class could declare itself to work reliably only with 
immutable objects. Performance will improve, since copies need only be made when 
an object *actually* changes (and the old immutable copy is deleted and the new 
version inserted in its place).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net