Why are there no ordered dictionaries?

Sat Nov 26 22:46:36 EST 2005

On Thu, 24 Nov 2005 18:42:45 +0100, Christoph Zwerschke <cito at online.de> wrote:

>Bengt Richter schrieb:
>
>>>d.setvalues((13, 14)) ==> d = OrderedDict((1, 13), (2, 14))
>
>> The implication above is that OrderedDict takes an *args argument,
>> but really it takes a single argument that is a sequence of k,v pairs,
>> (and maybe some keyword options).
>
>Right. Interpret it as a short notation only, I did not want to change 
>the interface. I think it's a common mistake to forget the brackets 
>because you think one pair should be enough. At least I often forget it.
Ok, I forget this too.
>
>> You could make keys, values, and items custom descriptors which would define __call__
>> for the old key() etc accesses, well as __getitem__ and __setitem__. Then you could write
>> 
>>     d.items[0], d.items[-1] = d.items[-1], d.items[0]
>> 
>> to swap the order of the first and last items in the thing (I hesitate to say dict ;-)
>> You could also operate on slices.
>
>Nice. You could also get the i-th value with d.values[i].
>
>But is this considered good style or would it be considered "dirty" to 
>have a callable member that also supports indexing and slicing? (I don't 
>know, just asking?) Plus, it opens a can of worms by increasing the 
>complexity tremendously. Let's see whether this can be handled.
I suspect not that complex, but it has the same performance disadvantage as
properties.

>
>> BTW, before I showed an example where d[2:3] returned
>> a new dict instance rather than d.items()[:]. I think the items list
> > is better, since they naturally add, sort, reverse, etc as lists,
> > and you can write OrderedDict(d[2:3]+d[1:2]) if you want a new dict.
>
>Not sure about that. I would rather expect that if you slice an object, 
>you get an object of the same type. And you can also add, sort, reverse, 
>etc the ordered dict if you want.
>
>> A slice assignment to d[i:j] is really sugar for something, but we have to decide exactly
>> what in case of duplicate keys in the incoming items, either with each other ...
>
>You mean slice assignments to d.items[i:j]. If you make the slice 
>assignment to d[i:j] then at least you cannot have conflicts in the 
>incoming items. The first question is: Should a slice assignment be 
>treated as deletion with subsequential addition (changing the order) or 
>as a replacement (i.e. try to not change the order)? I agree that the 
>second would be the expected semantics, but as you mentioned more 
>difficult to implement.
>
> > One way to define it would be
> >     d.items[i:j] = itemseq
>> 
>> to be implemented as sugar for
>>     newitems = d.items[:i] + list(itemseq) + d.items[j:]
>>     d.clear()
>>     d.update(newitems)
>
>Which should be the same as
>d = OrderedDict(d.items[:i] + list(itemseq) + d.items[j:])
>Sounds reasonable.
Actually it's not the same, which is why I wrote it with update.
Analogous to slice assignment in lists, the referred-to object
gets mutated. Hence preexisting references to the object see the change too.

If you just rebind d with a new OrderedDict object, previous bindings
are not affected. I.e.,
       d = OrderedDict(sorted((k,i)) for i,k in enumerate('abc'))
       e = d
       d = OrderedDict(d.items[:1] + [('b', 100)] + d.items[2:])
       d.items()[1] # => ('b', 100)
       e.items()[1] # => ('b', 1)   # unaffected

>
>> So
>>     d.reverse()
>> could be spelled
>>     d.items[:] = d.items[::-1]
>
>Slice assignment for keys and values would also allow to set a new key 
>order with
>
>d.keys[:] = newkeyseq
Do you really mean just re-ordering the keys without a corresponding reording of values??
That would be a weird renaming of all values. Or do you means that any key should still
retrieve the same value as before if used as d[key]? In which case the values must undergo
the same permutation as the keys. I.e., you are assuming key->value pairings remain stable
through any key reorderings?

>
>So no need to define a setkeys() method or let keys() take arguments.
>If we want to allow this kind of things at all.
>
>> If you are using slices, you can safely use them directly in the __getitem__ of th dict interface,
>> as I did in the "Que mas?" post. So we don't have to write d.items[i:j] and could write d[i:j].
>> The thing is, d[i:j] will tempt to skip ".items" in d.items[i] and write d[i], which has the dict
> > key meaning, not the item list index. It is faster not have a 
>descriptor between though.
>
>I still think d[i:j] should return an ordered dict, not an item list.
Easy to do either way.

>
>> I think maybe allowing write to keys or values is pretty iffy. There are too many weird
>> re-associations of keys and values possible, and which you could do my other means if you
>> really really wanted to. But I do think full index and slice read access would be fine.
>
>There are different opinions. Fuzzyman would probably say "Don't trust 
>yourself?" I myself am undecided. Perhaps you could expect that somebody 
>knows what he does if he makes a slice assignment to d.keys. In any way, 
>such an assignment should not "break" the directory (with "broken" I 
>mean that the internal keys sequence does not match the keys of the 
>internal dictionary). If anything does not match, it should raise a 
>KeyException. If it is implemented that way, I think we can allow it.
Exactly what, though? should e.g.
    d.keys[3] =  newk3
mean (not a suggested implementation, just to define semantics)
    keys = d.keys()
    if newk3 in keys and keys.index(newk3)!=3:
        raise ValueError,'Attempt to introduce duplicate key'
    items = d.items()
    items[3] = (newk3, items[3][1])
    d.clear()
    d.update(items)
?
This would allow what you might call renaming in place.
Similarly
    d.keys[i:j] = newkeysij
might have the semantics of
    keys = d.keys()
    outside = set(keys[:i])+set(keys[j:])
    if outside & set(newkeysij) or len(newkeysij) != len(set(newkeysij)):
        raise ValueError,'Attempt to introduce duplicate key(s)'
    items = d.items()
    items[i:j] = [(k, items[kx+i][1]) for kx,k in enumerate(newkeysij)]
    d.clear()
    d.update(items)

Is this what is desired?

>
>> I think also that d1==d2 should effectively be implemented as d1[:] == d2[:] -- i.e, compare
>> the item lists to implement comparisons.
>
>Wouldn't it be more performant to compare for 
>d1.internal_dict==d2.internal_dict and 
>d1.internal_sequence==d2.internal_sequence?
>You don't keep track of the item lists, they need to be built on every 
>occasion.
That depends on how you implement ;-)

Back from holiday, so maybe I'll hack something out.

Regards,
Bengt Richter