[Tutor] dictionaries and memory handling

Fri Feb 23 21:08:07 CET 2007

On Fri, Feb 23, 2007, Alan Gauld wrote:
>"Bill Campbell" <bill at celestial.net> wrote
>
>>>It seems that an SQL database would probably be the way to go, but I
>>>am a bit concerned about speed issues (even though running time is
>> ...
>> You would probably be better off using one of the hash databases,
>> Berkeley, gdbm, etc. (see the anydbm documentation).  These can
>> be treated exactly like dictionaries in python, and are probably
>> orders of magnitude faster than using an SQL database.
>
>I'm glad Bill suggested this because I'd forgotten about them 
>entirely!
>But while they wont literally be "orders of magnitude" faster - the
>disk I/O subsystem is usually the main limiter here -  they will be
>several factors faster, in fact many SQL databases use the dbm
>database under the hood.

While the disk subsystem is going to be a factor, the overhead of
communicating with the SQL server, parsing the queries, etc. will be far
greater than calculating the location of the record using the hashed key.
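
For anyone who hasn't used these, here is a minimal sketch of the
dictionary-style access I mean. It assumes Python 3, where the anydbm
module was renamed dbm; the file name and keys are made up for
illustration, and which backend gets used depends on what is installed.

```python
import dbm
import os
import tempfile

# Hypothetical database location, used only for this illustration.
db_path = os.path.join(tempfile.mkdtemp(), "example_db")

# "c" opens the database read/write and creates it if it does not exist.
with dbm.open(db_path, "c") as db:
    db["alpha"] = "1"   # str keys and values are stored as bytes
    db["beta"] = "2"

# Reopen read-only and look records up by key, just like a dictionary.
with dbm.open(db_path, "r") as db:
    value = db["alpha"]       # comes back as bytes, not str
    missing = "gamma" in db   # membership test, no query parsing involved

print(value, missing)
```

There is no server round trip and no query to parse here; the lookup is
a keyed read from the file on disk.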

FWIW: I've found that Berkeley DB btree files can be significantly
smaller than the equivalent Berkeley hash files.

I would really like to see somebody come up with a good alternative to the
Berkeley DB stuff from Sleepycat.  The source code is the most godawful
mess of #ifn*defs I've ever seen, with frequent API changes even in minor
release levels.  Take a look at the bdb source in python or perl if you
want to see what I'm talking about.

Bill
--
INTERNET:   bill at Celestial.COM  Bill Campbell; Celestial Software LLC
URL: http://www.celestial.com/  PO Box 820; 6641 E. Mercer Way
FAX:            (206) 232-9186  Mercer Island, WA 98040-0820; (206) 236-1676

``the purpose of government is to reign in the rights of the people''
    -Bill Clinton during an interview on MTV in 1993