Data/File Structure and Algorithm for Retrieving Sorted Data Chunk Efficiently

Alex Martelli aleaxit at yahoo.com
Fri Oct 8 05:43:34 EDT 2004


Josiah Carlson <jcarlson at uci.edu> wrote:

> > > Another problem is, the list update action could take a long time and
> > > big memory. The list could shrink and expand. Do I have to sort them
> > > all in memory and then update the textfile _everytime_ it's changed?
> > 
> > I don't understand this thread.  If you're trying to implement a real
> > system, why don't you use MySQL?  You're at nowhere near the traffic
> > level where that approach runs out of gas.
> 
> Or even the included bsddb.btree database (remember to translate to/from
> strings, open using bsddb.btopen() ).

But, given a bsdsb btree with, say, 10,000 entries in it (in virtually
sorted order), how does one speedily say "get entries 4020 to 4030", for
example?  That was the original poster's problem as originally posed.
If you call somedb.set_location(4020), you learn that integer keys are
only allowed for Recno and Queue DBs, not for BTree ones...

I think the crucial step (with current versions of bsddb) is to have
btflags=bsddb.db.DB_RECNUM at creation time; then, say xx is the
variable that's your database, xx.dbc.set_recno(4019) should position
the cursor as required.  Sleepycat does warn that DB_RECNUM can degrade
performance for some applications, though, so be sure to do some
benchmarks if you want to do things this way.


Alex



More information about the Python-list mailing list