set and dict iteration

Aaron Brady castironpi at gmail.com
Mon Aug 27 14:10:07 EDT 2012


On Thursday, August 23, 2012 1:11:14 PM UTC-5, Steven D'Aprano wrote:
> On Thu, 23 Aug 2012 09:49:41 -0700, Aaron Brady wrote:
> 
> 
> 
> [...]
> 
> > The patch for the above is only 40-60 lines.  However it introduces two
> 
> > new concepts.
> 
> > 
> 
> > The first is a "linked list", a classic dynamic data structure, first
> 
> > developed in 1955, cf. http://en.wikipedia.org/wiki/Linked_list . 
> 
> > Linked lists are absent in Python
> 
> 
> 
> They certainly are not. There's merely no named "linked list" class.
> 
> 
> 
> Linked lists are used by collections.ChainMap, tracebacks, xml.dom, 
> 
> Abstract Syntax Trees, and probably many other places. (Well, technically 
> 
> some of these are trees rather than lists.) You can trivially create a 
> 
> linked list:
> 
> 
> 
> x = [a, [b, [c, [d, [e, None]]]]]
> 
> 
> 
> is equivalent to a singly-linked list with five nodes. Only less 
> 
> efficient.
> 
> 
> 
> 
> 
> > The second is "uncounted references".  The uncounted references are
> 
> > references to "set iterators" exclusively, exist only internally to
> 
> > "set" objects, and are invisible to the rest of the program.  The reason
> 
> > for the exception is that iterators are unique in the Python Data Model;
> 
> > iterators consist of a single immutable reference, unlike both immutable
> 
> > types such as strings and numbers, as well as container types.  Counted
> 
> > references could be used instead, but would be consistently wasted work
> 
> > for the garbage collector, though the benefit to programmers' peace of
> 
> > mind could be significant.
> 
> 
> 
> The usual way to implement "uncounted references" is by using weakrefs. 
> 
> Why invent yet another form of weakref?
> 
> 
> 
> 
> 
> 
> 
> -- 
> 
> Steven


Hello S. D'Aprano.  Thanks for your support as always.

The semantics of the second collection are equivalent to a WeakSet.  The space and time consumption of a WeakSet are higher in comparison to a linked list.  However, so long as we iterated over it using the C API instead of creating an iterator, it would be consistent.  If we dynamically create the WeakSet on demand and free it when empty, the space consumption would be lower.  Typical use cases don't involve creating thousands of iterators, or rapidly creating and destroying them, so the performance impact might not be severe.

Regarding the bare weakrefs, if the iterator's destructor hasn't been called, then the pointer is still valid.  If it has been called, then it's not present in the list.  Unlike Python classes, the destructors of C extension classes are guaranteed to be called.  Therefore there are no points during exection at which a node needs to check whether a reference to its neighbor is valid.

Are your concerns based on data integrity, future maintainability, or what?



More information about the Python-list mailing list