Builtin classes list, set, dict reimplemented via B-trees
Tim Peters
tim.peters at gmail.com
Wed Sep 14 15:43:49 EDT 2005
[barnesc at engr.orst.edu]
> ...
> I've gotten bored and went back to one of my other projects:
> reimplementing the Python builtin classes list(), set(), dict(),
> and frozenset() with balanced trees (specifically, counted B-trees
> stored in memory).
>
> In short, this allows list lookup, insertion, deletion in O(log(N))
> time. It allows the set and dictionary types to be maintained in
> order, and insert/lookup/remove operations here take O(log(N)) time
> as well. Getting the k least or k greatest elements takes
> O(log(N)+k) time.
Note that BTrees for Python have been part of ZODB for many years:
Section 5.3, _BTrees Package_
http://www.zope.org/Wikis/ZODB/FrontPage/guide/node6.html
If you install ZODB, you can use its BTrees as in-memory data
structures without using any of the rest of ZODB. If you want to
persist them, great, that's what ZODB is for.
Note that ZODB's are really B+ trees, so iterating over the smallest k
takes O(k) time. As an extension to Python's mapping protocol, the
keys/values/items/iterkeys/itervalues/iteritems methods also accept
optional lower and upper bounds on the keys to return.
A gotcha: For scalability in multiprocess database apps, ZODB's
BTrees do not store their size. As a result, len(a_ZODB_BTree) takes
time linear in the number of elements.
> ...
> So my question is: are there any other *practical* applications of a
> B-tree based list/set/dict ?
Yes.
> In other words, is this module totally worth coding,
That's a different question entirely ;-)
> or is it just academic wankery and theoretical flim flam ? :)
It's probably not a way to get rich quick.
More information about the Python-list
mailing list