New Ordered Dictionery to Criticise

Fri Dec 2 09:03:43 EST 2005

Hello Bengt,

Bengt Richter wrote:
> On 1 Dec 2005 03:38:37 -0800, "Fuzzyman" <fuzzyman at gmail.com> wrote:
>
> >
> >Fuzzyman wrote:
> >> Sorry for this hurried message - I've done a new implementation of out
> >> ordered dict. This comes out of the discussion on this newsgroup (see
> >> blog entry for link to archive of discussion).
> >>
> >> See the latest blog entry to get at it :
> >> http://www.voidspace.org.uk/python/weblog/index.shtml
> >>
> >
> >Hello all,
> >
> >I've just done a new "beta 2" version. It has a full version of
> >FancyODict with the custome "callable sequence objects" for keys,
> >values and items. They are almost completely covered by tests.
> >
> >You can download the new(er) version from :
> >http://www.voidspace.org.uk/cgi-bin/voidspace/downman.py?file=odictbeta2.py
> >
> Ran my tests on it: this one may be of interest:
>  __________________________ entrypoint: test_popitem ___________________________
>
>      def test_popitem():
>          d = CandidateDict([(k,i*100) for i,k in enumerate('abcde')])
>  >       assert d.popitem() == ('e', 400)
>
>  [C:\pywk\ut\test\test_odicts.py:228]
>  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>  ...
>  ...
>  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>
>      def __getattr__(self, name):
>          """
>          Implemented so that access to ``sequence`` raises a warning.
>
>          >>> d = OrderedDict()
>          >>> d.sequence
>          []
>          """
>          if name == 'sequence':
>              warn('use of sequence attribute is deprecated. Use keys method '
>                  'instead.', DeprecationWarning)
>              # NOTE: Still (currently) returns a direct reference. Need to
>              #   because code that uses sequence will expect to be able to
>              #   mutate it in place.
>              return self._sequence
>          else:
>              # NOTE: strictly unnecessary, but it raises the appropriate error
>  >           getattr(self, name)
>
>  [c:\pywk\clp\odictbeta2.py:309]
>  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>  Recursion detected (same locals & position)
>  !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
>  ============= tests finished: 19 passed, 9 failed in 1.57 seconds =============
>

Yep the line :

getattr(self, name)

is a bug.

I've replaced it now (not yet "published") with
object.__getattr__(self, name)
It could equally just be replaced with a raise AttributeError

You've picked up on another typo in the code, whichis odd because
``popitem`` is tested.

Oh well - works now.

[snip..]
> >    You can currently alter the order (of keys, values *and* items) by
> >passing an iterable into those methods.
> I'm playing this game, but I wonder how much of it really makes sense ;-)
>

Can you explain what you mean ?

> >
> >    Someone has suggested that this "smells bad" - and it ought to be
> >done through separate `setkeys``, ``setvalues``, and ``setitems``
> >methods.
> >
> >    I'm *inclined* to agree, but I don't feel strongly about it. Anyone
> >else got any opinions ?
> I don't see a big difference. What's important, if you ask me, is
> whether you get an idea of what will happen when you pass the iterable as
> an argument. OTTOMH that suggests that keys/values/items[slice] = iterable
> have more expectation-generating precedent, and in fact perhaps should define
> what is to happen if no other constraints are violated.

Don't completely follow.

Are you saying that

d = OrderedDict(some_iterable)
d.keys[slice] = iterable

is more *understandable*  ("expectation-generating precedent" WTF) ?

The alternative being : d.keys(iterable)

In which case I'm inclined to agree. *However* - having custom objects
for items/keys/values is undoubtably less efficient.

Therefore I'm supplying two classes OrderedDict and FancyODict (name
not yet fixed).

OrderedDict needs a way of changing the keys - so it needs either

d.keys(iterable)

*or*

d.setkeys(iterable)

Which is preferable, is my question.

I think you're saying you don't care :-)
In which case I will leave it as it presently is.

> >
> >* ``repr`` ought to return a value that ``eval`` could use to turn back
> >into an OrderedDict.
> >
> >    I have actually done an implementation of this, but it would mean
> >that *all* the doctests need to be changed. I *will* do this at some
> >point.
> Point in favor of py.test's separation of test/tested I guess.
>

Probably. :-)

doctest is part of the standard library - which is a point in it's
favour.

> >
> >* Slice assignment.
> >
> >    The semantics for slice assignment are fiddly.
> >
> >    For example, what do you do if in a slice assignment a key collides
> >with an existing key ?
> IMO, a slice defines the sliced-out as well as the untouched, and I think
> it is an error to collide with the unouched part, since it requires either
> modifying the "untouched" or ignoring incoming slice data.
> Incoming can go by the rules of a plain dict constuctor, except preserving order,
> and the rest should really be untouched IMO. BTW, I think new keys in the
> incoming slice data should NOT wind up at the end of the overall dict, since
> that violates the "untouched" part.
>

"new keys in the incoming slice data should NOT wind up at the end of
the overall dict"

I'm not a 100% certain I understand you, and I possibly didn't explain
myself properly.

If you assign to a slice, and there is a key collision. The old
(pre-existing) key is removed and the new inserted at the expected
index - so it doesn't end up at the "end".

> >
> >    My current implementation does what an ordinary dictionary does,
> >the new value overwrites the previous one. This means that the
> >dictionary can reduce in size as the assignment progresses. {sm;:?}
> That part is ok, though I wonder if some might want to have a strict-replacement
> mode for an odict, so slices can't change sizes.
> BTW  just thought of a name: itemdict -- to suggest the item-list-ordered semantics.
> >
> >    I think this is preferable to raising an error and preventing
> >assignment. It does prevent an optimisation whereby I calculate the
> >indexes of all the new items in advance.
> a strict_replacement=True kwyord arg to the descriptor could let you
> do both.
>

Then I have to implement *both* ways of doing it ! I'll consider it.

> >
> >    It also means you can't rely on the index of a key from a slice
> >assignment, unless you know that there will be no key collisions.
> Well, if you guarantee no name collisions in the source items by using
> an ordered dict as source, and make collisions with the "untouched"
> parts of a slice assignment an error, then you can rely on it.
> IMO that would be a useful middle way.
>

I'll consider that as well. As I said I don't like raising errors when
I don't have to. If the programmer wants to guarantee there are no
collisions then he can check himself. :-)

> >
> >     In general I'm *against* preventing programmers from doing things,
> >so long as the documentation carries an appropriate warning.
> Me too, in general ;-)
[snip..]
> >    # we can assign to every other key
> >    # using an ordered dict
> Why the restriction to ordered dict, other than as a convenient
> way to pre-normalize the item list? Doesn't update accept a plain item list?
>
Allow slice assignment of a list of tuples like update... hmmm...

Maybe if anyone asks for it "in the wild".

> >    d[::2] = OrderedDict([(2, 9), (4, 8)])
> i.e., why not accept plain [(2, 9), (4, 8)] here?
> >    len(d) == 4
> >    False
> BTW, this one I don't do yet. I reject aslice.step not in (None, 1)
> I seemed a bit too weird. But I guess I could just let the programmer decide ;-)

I didn't see any reason to restrict it, seeing as for my (unoptimised)
implementation it doesn't make any difference.

> >
> >    d
> >    {2: 9, 4: 8}
> >
> >    """
> >    Because of the key collisions the length of
> >    d has changed - it now only has two keys instead
> >    of four.
> >    """
> IMO the above collision is an error that should raise an exception,
> because the collision is with a key outside the (albeit interlaced)
> target slice, so you are effectively assigning outside the slice.
>

Yeah - maybe. Still not sure about this.

> BTW, what about slices of keys and values? E.g. does
>
>     d.keys[:] = ['prefix_'+ key for key in d.keys()]
>
> make sense? Note that it is different from d[:] = something.
>

Yep - it allows you to change the keys in place. (Retaining the
position in the sequence)

> Similarly d.values, thoug d.items[sliceinst] might be the same
> as d[sliceinst] for assignment. d[sliceinst] returns an ordered dict though,
> where as d.items[sliceinst] returns the same item list slice as
> d.items()[sliceinst].
>

Yep. Lots to consider. Only a few minor tweaks and my implementation is
basically working enough to release.

All the best,

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml

> Regards,
> Bengt Richter