sorteddict PEP proposal [started off as orderedict]
Steven Bethard
steven.bethard at gmail.com
Tue Sep 25 14:58:36 EDT 2007
Paul Hankin wrote:
> On Sep 25, 12:51 pm, Mark Summerfield <m.n.summerfi... at googlemail.com>
> wrote:
>> On 25 Sep, 12:19, Paul Hankin <paul.han... at gmail.com> wrote:
>>
>>
>>
>>> Recall sorted...
>>> sorted(iterable, cmp=None, key=None, reverse=False) --> new sorted
>>> list
>>> So why not construct sorteddicts using the same idea of sortedness?
>>> sorteddict((mapping | sequence | nothing), cmp=None, key=None,
>>> reverse=None)
>>> Then we can specify the exact behaviour of sorteddicts: it's the same
>>> as a regular dict, except when the order of keys is important they're
>>> ordered as if they were sorted by sorted(keys, cmp=sd._cmp,
>>> key=sd._key, reverse=sd._reverse). Here I imagine cmp, key and reverse
>>> are stored in the new sorteddict as attributes _cmp, _key, _reverse -
>>> obviously the actual implementation may differ.
>>> This has the benefit of making sorteddict's behaviour explicit and
>>> easy to understand, and doesn't introduce a new sorting API when we
>>> already have a perfectly decent one.
>>> The only problem here is the **kwargs form of the dict constructor
>>> doesn't translate as we're using keyword args to pass in the sort
>>> criterion. Perhaps someone has an idea of how this can be solved. If
>>> nothing else, this constructor could be dropped for sorteddicts, and
>>> sorteddict(dict(**kwargs), cmp=..., key=..., reverse=...) when that
>>> behaviour is wanted.
>>> I don't like the integral indexing idea or the slicing: indexing and
>>> slicing are already part of python and it's bad to have different
>>> syntax for the same concept on sorteddicts. I'd say it's not an
>>> important enough for sorteddicts anyway.
>>> --
>>> Paul Hankin
>> This makes a lot of sense to me. But how do you envisage it would be
>> implemented?
>
> Here's a first go. Sorting occurs when the keys are iterated over,
> making it fast (almost as a dict) for construction, insertion, and
> deletion, but slow if you're iterating a lot. You should look at some
> use cases to decide if this approach is best, or if a sorted
> datastructure should be used instead, but my instinct is that this is
> a decent approach. Certainly, you're unlikely to get a simpler
> implementation :)
>
> class sorteddict(dict):
> "A sorted dictionary"
> def __init__(self, arg=None, cmp=None, key=None, reverse=False):
> if arg:
> super(sorteddict, self).__init__(arg)
> else:
> super(sorteddict, self).__init__()
> self._cmp = cmp
> self._key = key
> self._reverse = reverse
> def keys(self):
> return sorted(super(sorteddict, self).keys(), cmp=self._cmp,
> key=self._key, reverse=self._reverse)
> def iter_keys(self):
> return (s for s in self.keys())
> def items(self):
> return [(key, self[key]) for key in self.keys()]
> def iter_items(self):
> return ((key, self[key]) for key in self.keys())
> def values(self):
> return [self[key] for key in self.keys()]
> def iter_values(self):
> return (self[key] for key in self.keys())
> def __str__(self):
> return '{' + ', '.join('%s: %s' % (repr(k), repr(v))
> for k, v in self.iter_items()) + '}'
> def __repr__(self):
> return str(self)
> def __iter__(self):
> return self.iter_keys()
With this is the implementation, I'm definitely -1. Not because it's a
bad implementation, but because if the iteration is always doing a sort,
then there's no reason for a separate data structure. Just use sorted()
directly on a regular dict. That has the advantage of being much more
obvious about where the expensive operations are::
for key in sorted(d, ...):
... whatever you want to do ...
IMHO, the only reason to introduce a sorteddict() is if it minimizes the
cost of sorting the keys.
STeVe
More information about the Python-list
mailing list