Builtin classes list, set, dict reimplemented via B-trees

Tim Peters tim.peters at gmail.com
Wed Sep 14 15:43:49 EDT 2005


[barnesc at engr.orst.edu]
> ...
> I've gotten bored and went back to one of my other projects:
> reimplementing the Python builtin classes list(), set(), dict(),
> and frozenset() with balanced trees (specifically, counted B-trees
> stored in memory).
>
> In short, this allows list lookup, insertion, deletion in O(log(N))
> time.  It allows the set and dictionary types to be maintained in
> order, and insert/lookup/remove operations here take O(log(N)) time
> as well.  Getting the k least or k greatest elements takes
> O(log(N)+k) time.

Note that BTrees for Python have been part of ZODB for many years:

    Section 5.3, _BTrees Package_
    http://www.zope.org/Wikis/ZODB/FrontPage/guide/node6.html

If you install ZODB, you can use its BTrees as in-memory data
structures without using any of the rest of ZODB.  If you want to
persist them, great, that's what ZODB is for.

Note that ZODB's are really B+ trees, so iterating over the smallest k
takes O(k) time.  As an extension to Python's mapping protocol, the
keys/values/items/iterkeys/itervalues/iteritems methods also accept
optional lower and upper bounds on the keys to return.

A gotcha:  For scalability in multiprocess database apps, ZODB's
BTrees do not store their size.  As a result, len(a_ZODB_BTree) takes
time linear in the number of elements.

> ...
> So my question is: are there any other *practical* applications of a
> B-tree based list/set/dict ?

Yes.

> In other words, is this module totally worth coding,

That's a different question entirely ;-)

> or is it just academic wankery and theoretical flim flam ? :)

It's probably not a way to get rich quick.



More information about the Python-list mailing list