|Title:||Add a frozendict builtin type|
|Author:||Victor Stinner <victor.stinner at gmail.com>|
I'm rejecting this PEP. A number of reasons (not exhaustive):
- According to Raymond Hettinger, use of frozendict is low. Those that do use it tend to use it as a hint only, such as declaring global or class-level "constants": they aren't really immutable, since anyone can still assign to the name.
- There are existing idioms for avoiding mutable default values.
- The potential of optimizing code using frozendict in PyPy is unsure; a lot of other things would have to change first. The same holds for compile-time lookups in general.
- Multiple threads can agree by convention not to mutate a shared dict, there's no great need for enforcement. Multiple processes can't share dicts.
- Adding a security sandbox written in Python, even with a limited scope, is frowned upon by many, due to the inherent difficulty with ever proving that the sandbox is actually secure. Because of this we won't be adding one to the stdlib any time soon, so this use case falls outside the scope of a PEP.
On the other hand, exposing the existing read-only dict proxy as a built-in type sounds good to me. (It would need to be changed to allow calling the constructor.) GvR.
Update (2012-04-15): A new MappingProxyType type was added to the types module of Python 3.3.
Add a new frozendict builtin type.
A frozendict is a read-only mapping: a key cannot be added or removed, and a key is always mapped to the same value. However, frozendict values are not required to be hashable. A frozendict is hashable if and only if all of its values are hashable.
- Immutable global variable like a default configuration.
- Default value of a function parameter. Avoid the issue of mutable default arguments.
- Implement a cache: frozendict can be used to store function keywords. frozendict can be used as a key of a mapping or as a member of set.
- frozendict avoids the need for a lock when the frozendict is shared by multiple threads or processes, especially a hashable frozendict. It would also help to prohibit coroutines (generators + greenlets) from modifying the global state.
- frozendict lookup can be done at compile time instead of runtime because the mapping is read-only. frozendict can be used instead of a preprocessor to remove conditional code at compilation, like code specific to a debug build.
- frozendict helps to implement read-only object proxies for security modules. For example, it would be possible to use frozendict type for __builtins__ mapping or type.__dict__. This is possible because frozendict is compatible with the PyDict C API.
- frozendict avoids the need of a read-only proxy in some cases. frozendict is faster than a proxy because getting an item in a frozendict is a fast lookup whereas a proxy requires a function call.
- frozendict has to implement the Mapping abstract base class
- frozendict keys and values can be unorderable
- a frozendict is hashable if all keys and values are hashable
- frozendict hash does not depend on the items creation order
- Add a PyFrozenDictObject structure based on PyDictObject with an extra "Py_hash_t hash;" field
- frozendict.__hash__() is implemented using hash(frozenset(self.items())) and caches the result in its private hash attribute
- Register frozendict as a collections.abc.Mapping
- frozendict can be used with PyDict_GetItem(), but PyDict_SetItem() and PyDict_DelItem() raise a TypeError
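The constraints and the hashing rule above can be illustrated with a pure-Python stand-in. This is only a sketch under stated assumptions: FrozenDictSketch is a hypothetical name, it is not the C type based on PyDictObject that this PEP proposes, and its internal dict stays reachable from Python.

```python
from collections.abc import Mapping


class FrozenDictSketch(Mapping):
    """Hypothetical pure-Python stand-in for the proposed frozendict."""

    __slots__ = ("_data", "_hash")

    def __init__(self, *args, **kw):
        self._data = dict(*args, **kw)  # private copy of the input
        self._hash = None               # computed lazily, then cached

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __hash__(self):
        if self._hash is None:
            # Raises TypeError if any value is not hashable.
            self._hash = hash(frozenset(self._data.items()))
        return self._hash


a = FrozenDictSketch(x=1, y=2)
b = FrozenDictSketch([("y", 2), ("x", 1)])  # same items, different order
```

Because the hash is computed from a frozenset of the items, a and b compare equal and hash equally regardless of creation order. The Mapping base class supplies the read-only methods (get, keys, items, __contains__, __eq__) and no write methods.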
To ensure that a frozendict is hashable, values can be checked before creating the frozendict:
    def hashabledict(*args, **kw):
        # frozendict is the type proposed by this PEP. Build a plain dict
        # first so that every accepted argument form yields its values.
        items = dict(*args, **kw)
        # ensure that all values are hashable
        for value in items.values():
            if isinstance(value, (int, str, bytes, float, frozenset, complex)):
                # skip computing the hash (which may be slow) for builtin
                # types known to be hashable for any value
                continue
            hash(value)  # raises TypeError if the value is not hashable
        # don't check the keys: frozendict already checks them
        return frozendict(items)
namedtuple may fit the requirements of a frozendict.
A namedtuple is not a mapping, it does not implement the Mapping abstract base class.
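A short check makes the difference concrete. This sketch uses only the standard library; the Config type and its fields are hypothetical.

```python
from collections import namedtuple
from collections.abc import Mapping

Config = namedtuple("Config", ["host", "port"])
cfg = Config(host="localhost", port=8080)

# A namedtuple is a tuple subclass, so it is not registered as a Mapping
# and does not support lookup by key.
is_mapping = isinstance(cfg, Mapping)

try:
    cfg["host"]
except TypeError:
    key_lookup_works = False
else:
    key_lookup_works = True

# _asdict() returns a real mapping, but that mapping is mutable.
as_mapping = cfg._asdict()
```

Fields are read as attributes (cfg.host) or by position (cfg[0]), which is why code expecting the Mapping interface cannot accept a namedtuple directly.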
"frozendict can be implemented in Python using descriptors" and "frozendict just need to be practically constant."
If frozendict is used to harden Python (for security purposes), it must be implemented in C. A type implemented in C is also faster.
PEP 351 was rejected.
PEP 351 tries to freeze an object and so may convert a mutable object to an immutable object (using a different type). frozendict doesn't convert anything: hash(frozendict) raises a TypeError if a value is not hashable. Freezing an object is not the purpose of this PEP.
Python has a builtin dictproxy type used by the type.__dict__ getter descriptor. This type is not public. dictproxy is a read-only view of a dictionary, but it is not a read-only mapping: if the underlying dictionary is modified, the dictproxy reflects the change.
dictproxy can be instantiated using ctypes and the Python C API; see for example the recipe "make dictproxy object via ctypes.pythonapi and type()" (Python recipe 576540) by Ikkei Shimomura. The recipe contains a test checking that a dictproxy is "mutable" (by modifying the dictionary linked to the dictproxy).
However, dictproxy can be useful in some cases, where its mutable property is not an issue, to avoid a copy of the dictionary.
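The view behaviour can be observed with types.MappingProxyType, the public type added in Python 3.3 according to the update note in this PEP. The sketch below assumes Python 3.3 or later; the settings dictionary is hypothetical.

```python
from types import MappingProxyType

settings = {"debug": False}
view = MappingProxyType(settings)  # read-only view, no copy is made

# Writing through the proxy is rejected.
try:
    view["debug"] = True
except TypeError:
    write_blocked = True
else:
    write_blocked = False

# Mutating the underlying dict is still allowed, and the proxy sees it.
settings["debug"] = True
seen_through_view = view["debug"]
```

The proxy protects against accidental writes by code that only holds the view, but anyone holding the original dictionary can still change what the view shows.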
- Implementing an Immutable Dictionary (Python recipe 498072) by Aristotelis Mikropoulos. Similar to frozendict except that it is not truly read-only: it is possible to access its private internal dict. It does not implement __hash__ and has an implementation issue: __init__() can be called again to modify the mapping.
- PyWebmail contains an ImmutableDict type: webmail.utils.ImmutableDict. It is hashable if keys and values are hashable. It is not truly read-only: its internal dict is a public attribute.
- remember project: remember.dicts.FrozenDict. It is used to implement a cache: FrozenDict is used to store function callbacks. FrozenDict may be hashable. It has an extra supply_dict() class method to create a FrozenDict from a dict without copying the dict: store the dict as the internal dict. Implementation issue: __init__() can be called to modify the mapping and the hash may differ depending on item creation order. The mapping is not truly read-only: the internal dict is accessible in Python.
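The two weaknesses mentioned for these wrappers (a reachable internal dict and a re-callable __init__()) can be reproduced with a small sketch. WrapperDictSketch is a hypothetical class written for illustration; it is not the code of any of the projects listed above.

```python
class WrapperDictSketch:
    """Hypothetical wrapper exposing only read methods of an internal dict."""

    def __init__(self, *args, **kw):
        self._data = dict(*args, **kw)

    def __getitem__(self, key):
        return self._data[key]

    def __len__(self):
        return len(self._data)

    def __contains__(self, key):
        return key in self._data


d = WrapperDictSketch(a=1)

# The "private" dict is only private by convention.
d._data["b"] = 2
size_after_private_write = len(d)

# Calling __init__() again replaces the whole content.
d.__init__(c=3)
has_a_after_reinit = "a" in d
```

A type implemented in C, as proposed by this PEP, has no Python-level attribute holding the storage, so neither route is available.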
Blacklist approach: inherit from dict and override write methods to raise an exception. It is not truly read-only: it is still possible to call dict methods on such a "frozen dictionary" to modify it.
- brownie: brownie.datastructures.ImmutableDict. It is hashable if keys and values are hashable. The werkzeug project has the same code: werkzeug.datastructures.ImmutableDict. ImmutableDict is used for global constants (configuration options). The Flask project uses the ImmutableDict of werkzeug for its default configuration.
- SQLAlchemy project: sqlalchemy.util.immutabledict. It is not hashable and has an extra method: union(). immutabledict is used for the default value of parameters of some functions expecting a mapping. Example: mapper_args=immutabledict() in SqlSoup.map().
- Frozen dictionaries (Python recipe 414283) by Oren Tirosh. It is hashable if keys and values are hashable. Included in the following projects:
- The gsakkis-utils project written by George Sakkis includes a frozendict type: datastructs.frozendict
- characters: scripts/python/frozendict.py. It is hashable. __init__() sets __init__ to None.
- Old NLTK (1.x): nltk.util.frozendict. Keys and values must be hashable. __init__() can be called twice to modify the mapping. frozendict is used to "freeze" an object.
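The limitation of the blacklist approach can be shown in a few lines. BlacklistFrozenDict below is a hypothetical class written for illustration, not the code of any project listed above.

```python
class BlacklistFrozenDict(dict):
    """Hypothetical dict subclass whose write methods raise TypeError."""

    def _readonly(self, *args, **kw):
        raise TypeError("this dictionary is read-only")

    __setitem__ = __delitem__ = _readonly
    clear = pop = popitem = setdefault = update = _readonly


d = BlacklistFrozenDict(a=1)

# The overridden method blocks the usual syntax.
try:
    d["b"] = 2
except TypeError:
    blocked = True
else:
    blocked = False

# Calling the base class method directly bypasses the override.
dict.__setitem__(d, "b", 2)
```

After the last line d contains both keys, so the subclass only protects against accidental writes.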
Hashable dict: inherit from dict and just add a __hash__ method.
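A sketch of this approach shows why hashing a still-mutable dict is fragile. HashableDict is a hypothetical name, and the hash formula is borrowed from the one this PEP specifies for frozendict.

```python
class HashableDict(dict):
    """Hypothetical dict subclass that adds __hash__ but stays mutable."""

    def __hash__(self):
        return hash(frozenset(self.items()))


key = HashableDict(x=1)
table = {key: "cached"}

found_before = table.get(HashableDict(x=1))

# Mutating the key after insertion changes its content, while the table
# still files it under the hash computed at insertion time.
key["y"] = 2

found_by_old_content = table.get(HashableDict(x=1))
found_by_new_content = table.get(HashableDict(x=1, y=2))
```

Before the mutation the entry is found by an equal key. Afterwards a key equal to the old content no longer compares equal to the stored key, and a key equal to the new content hashes differently from the stored hash (barring a hash collision), so the entry can no longer be retrieved by value.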
This document has been placed in the public domain.