New Ordered Dictionery to Criticise

Bengt Richter bokr at oz.net
Fri Dec 2 02:06:47 EST 2005


On 1 Dec 2005 03:38:37 -0800, "Fuzzyman" <fuzzyman at gmail.com> wrote:

>
>Fuzzyman wrote:
>> Sorry for this hurried message - I've done a new implementation of out
>> ordered dict. This comes out of the discussion on this newsgroup (see
>> blog entry for link to archive of discussion).
>>
>> See the latest blog entry to get at it :
>> http://www.voidspace.org.uk/python/weblog/index.shtml
>>
>
>Hello all,
>
>I've just done a new "beta 2" version. It has a full version of
>FancyODict with the custome "callable sequence objects" for keys,
>values and items. They are almost completely covered by tests.
>
>You can download the new(er) version from :
>http://www.voidspace.org.uk/cgi-bin/voidspace/downman.py?file=odictbeta2.py
>
Ran my tests on it: this one may be of interest:
 __________________________ entrypoint: test_popitem ___________________________

     def test_popitem():
         d = CandidateDict([(k,i*100) for i,k in enumerate('abcde')])
 >       assert d.popitem() == ('e', 400)

 [C:\pywk\ut\test\test_odicts.py:228]
 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
 ...
 ...
 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

     def __getattr__(self, name):
         """
         Implemented so that access to ``sequence`` raises a warning.  

         >>> d = OrderedDict()
         >>> d.sequence
         []
         """
         if name == 'sequence':
             warn('use of sequence attribute is deprecated. Use keys method '
                 'instead.', DeprecationWarning)
             # NOTE: Still (currently) returns a direct reference. Need to
             #   because code that uses sequence will expect to be able to
             #   mutate it in place.
             return self._sequence
         else:
             # NOTE: strictly unnecessary, but it raises the appropriate error
 >           getattr(self, name)

 [c:\pywk\clp\odictbeta2.py:309]
 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
 Recursion detected (same locals & position)
 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 ============= tests finished: 19 passed, 9 failed in 1.57 seconds =============

>Full discussion of the remaining issues below, or at :
>http://www.voidspace.org.uk/python/weblog/arch_d7_2005_11_26.shtml#e147
>
>Progress on the updated implementation of dict continues. (I hestitate
>to say *new* version, as it's just a heavy makeover for the old code -
>which was basically sound).
>
>``FancyODict`` is now a full implementation of an Ordered Dictionary,
>with custom *callable sequence objects* for ``keys``, ``values``, and
>``items``. These can be called like normal methods, but can also be
>accessed directly as sequence objects. This includes assigning to,
>indexing, and slicing - as well as all the other relevant sequence
>methods. {sm;:-)}
>
>I've also added an optional index to ``OrderedDict.popitem``.
>
>I'm sure there are lots of ways this can be optimised for efficiency -
>but the new objects have got pretty full test coverage.
>
>You can download the new version (for testing) from `odict Beta 2
><http://www.voidspace.org.uk/cgi-bin/voidspace/downman.py?file=odictbeta2.py>`_
>
>The following issues still remain :
>
>* ``FancyOdict`` is a separate class from ``OrderedDict``.
>
>    Because this version is *undoubtably* less efficient than
>OrderedDict, my current thinking is that I should leave them separate
>(and have both available). Being able to operate on the
>keys/values/items as sequences is for convenience only.
>
>    Anyone got a suggestion for a better name than ``FancyODict`` ?
>
>* You can no longer access the key order directly. The old ``sequence``
>attribute is depracated and will eventually go away.
>
>    You can currently alter the order (of keys, values *and* items) by
>passing an iterable into those methods.
I'm playing this game, but I wonder how much of it really makes sense ;-)

>
>    Someone has suggested that this "smells bad" - and it ought to be
>done through separate `setkeys``, ``setvalues``, and ``setitems``
>methods.
>
>    I'm *inclined* to agree, but I don't feel strongly about it. Anyone
>else got any opinions ?
I don't see a big difference. What's important, if you ask me, is
whether you get an idea of what will happen when you pass the iterable as
an argument. OTTOMH that suggests that keys/values/items[slice] = iterable
have more expectation-generating precedent, and in fact perhaps should define
what is to happen if no other constraints are violated.
>
>* ``repr`` ought to return a value that ``eval`` could use to turn back
>into an OrderedDict.
>
>    I have actually done an implementation of this, but it would mean
>that *all* the doctests need to be changed. I *will* do this at some
>point.
Point in favor of py.test's separation of test/tested I guess.

>
>* Slice assignment.
>
>    The semantics for slice assignment are fiddly.
>
>    For example, what do you do if in a slice assignment a key collides
>with an existing key ?
IMO, a slice defines the sliced-out as well as the untouched, and I think
it is an error to collide with the unouched part, since it requires either
modifying the "untouched" or ignoring incoming slice data.
Incoming can go by the rules of a plain dict constuctor, except preserving order,
and the rest should really be untouched IMO. BTW, I think new keys in the
incoming slice data should NOT wind up at the end of the overall dict, since
that violates the "untouched" part.

>
>    My current implementation does what an ordinary dictionary does,
>the new value overwrites the previous one. This means that the
>dictionary can reduce in size as the assignment progresses. {sm;:?}
That part is ok, though I wonder if some might want to have a strict-replacement
mode for an odict, so slices can't change sizes.
BTW  just thought of a name: itemdict -- to suggest the item-list-ordered semantics.
>
>    I think this is preferable to raising an error and preventing
>assignment. It does prevent an optimisation whereby I calculate the
>indexes of all the new items in advance.
a strict_replacement=True kwyord arg to the descriptor could let you
do both.

>
>    It also means you can't rely on the index of a key from a slice
>assignment, unless you know that there will be no key collisions.
Well, if you guarantee no name collisions in the source items by using
an ordered dict as source, and make collisions with the "untouched"
parts of a slice assignment an error, then you can rely on it.
IMO that would be a useful middle way.

>
>     In general I'm *against* preventing programmers from doing things,
>so long as the documentation carries an appropriate warning.
Me too, in general ;-)
>
>    An example will probably help illustrate this :
>
>.. raw:: html
>
>    {+coloring}
>
>    d = OrderedDict()
>    d[1] = 1
>    d[2] = 2
>    d[3] = 3
>    d[4] = 4
>    d.keys()
>    [1, 2, 3]
Where's the 4?
(I.e., why not just take your examples off an interactive screen? ;-)

>
>    # fetching every other key
>    # using an extended slice
>    # this actually returns an OrderedDict
>    d[::2]
>    {1: 1, 3: 3}
     OrderedDict([(1, 1), (3, 3)])  # in future, right?
(i.e. '%s(%s)' %( type(d).__name__. d.items()) # whatever the final name ;-)

>
>    # we can assign to every other key
>    # using an ordered dict
Why the restriction to ordered dict, other than as a convenient
way to pre-normalize the item list? Doesn't update accept a plain item list?

>    d[::2] = OrderedDict([(2, 9), (4, 8)])
i.e., why not accept plain [(2, 9), (4, 8)] here?
>    len(d) == 4
>    False
BTW, this one I don't do yet. I reject aslice.step not in (None, 1)
I seemed a bit too weird. But I guess I could just let the programmer decide ;-)
>
>    d
>    {2: 9, 4: 8}
>
>    """
>    Because of the key collisions the length of
>    d has changed - it now only has two keys instead
>    of four.
>    """
IMO the above collision is an error that should raise an exception,
because the collision is with a key outside the (albeit interlaced)
target slice, so you are effectively assigning outside the slice.

BTW, what about slices of keys and values? E.g. does

    d.keys[:] = ['prefix_'+ key for key in d.keys()]

make sense? Note that it is different from d[:] = something.

Similarly d.values, thoug d.items[sliceinst] might be the same
as d[sliceinst] for assignment. d[sliceinst] returns an ordered dict though,
where as d.items[sliceinst] returns the same item list slice as
d.items()[sliceinst].

Regards,
Bengt Richter



More information about the Python-list mailing list