BSDDB problem

Skip Montanaro skip at mojam.com
Thu Jan 11 15:05:08 EST 2001


    MJF> I am using Python 2.0 on a Linux machine. I wanted to use the bsddb
    MJF> module to get the information out of the Netscape History.dat file.
    MJF> I tried the following:

    MJF> import bsddb
    MJF> hist = bsddb.hashopen('history.dat', 'r')

    MJF> I got the error:

    MJF> Traceback (most recent call last):
    MJF>   File "<stdin>", line 1, in ?
    MJF> bsddb.error: (22, 'Invalid argument')

You've been bitten by Berkeley DB's version skew.  As far as I know, each
time Sleepycat has released a new major version they've changed the file
format.  Netscape 4.75 statically linked Berkeley DB 1.85.  You are almost
certainly using 2.x or possibly even 3.x on your Debian machine.  Try
creating a scratch database with your Python's bsddb module (I called mine
spam.db), then run "file spam.db" and compare the result with
"file history.dat".  Here's what I see:

    % file history.dat
    history.dat: Berkeley DB 1.85 Hash/Little Endian (Version 2, Bucket Size 4096, Bucket Shift 12, Directory Size 256, Segment Size 256, Segment Shift 8, Overflow Point 10, Last Freed 10, Max Bucket 513, High Mask 0x3ff, Low Mask 0x1ff, Fill Factor 48, Number of Keys 14938)
    % file spam.db 
    spam.db: Berkeley DB 2.X Hash/Little Endian (Version 5, Logical sequence number: file - 0, offset - 0, Bucket Size 4096, Overflow Point 2, Last Freed 0, Max Bucket 3, High Mask 0x3, Low Mask 0x1, Fill Factor 0, Number of Keys 0)
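
If you don't have a "file" command that knows about Berkeley DB, the same
check can be roughed out in Python.  This is only a sketch for a current
Python 3, not what "file" actually does internally.  It assumes the hash
magic number 0x061561 sits at byte offset 0 in the 1.x layout and at byte
offset 12 (after the logical sequence number and page number) in the 2.x
and later layouts.  The names bdb_hash_layout and sniff are made up for
illustration.

```python
import struct

HASH_MAGIC = 0x061561  # Berkeley DB hash magic number (assumed, see above)


def bdb_hash_layout(header):
    """Guess the hash file layout from the first 16 bytes of a file."""
    # Try both byte orders, since the file may come from another machine.
    for order in ("<", ">"):
        if len(header) >= 4:
            (magic,) = struct.unpack(order + "I", header[0:4])
            if magic == HASH_MAGIC:
                return "1.x"
        if len(header) >= 16:
            (magic,) = struct.unpack(order + "I", header[12:16])
            if magic == HASH_MAGIC:
                return "2.x or later"
    return "unknown"


def sniff(path):
    """Read the start of the file at path and classify it."""
    with open(path, "rb") as f:
        return bdb_hash_layout(f.read(16))
```

Calling sniff("history.dat") and sniff("spam.db") should then tell the two
layouts apart, provided the offsets assumed above are right for your files.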

You can probably build Python 2.x against Berkeley DB 1.85 if you're
desperate to read that file.  Look around your lib directories for something
like libdb1.a.  Be careful with the include files, however.  You don't want
to #include the 2.x include files and link with the 1.x library.

Perhaps a better alternative is to use db_dump185 to dump the history.dat
file to plain text and then use db_load to create a history2.dat file, in
the newer format, from that:

    % db_dump185 ~/.netscape/history.dat > history2.txt
    % db_load history2.dat < history2.txt
    % file history2.dat
    history2.dat: Berkeley DB 2.X Hash/Little Endian (Version 5, Logical sequence number: file - 0, offset - 0, Bucket Size 4096, Overflow Point 9, Last Freed 596, Max Bucket 422, High Mask 0x1ff, Low Mask 0xff, Fill Factor 48, Number of Keys 15011)
    % python
    Python 2.0 (#19, Jan 10 2001, 22:34:25) 
    [GCC 2.95.3 19991030 (prerelease)] on linux2
    Type "copyright", "credits" or "license" for more information.
    >>> import bsddb
    >>> db = bsddb.hashopen("history2.dat")
    >>> len(db)
    15011

Of course, you will have to do that each time you want to analyze your
history.dat file, since Netscape will keep writing it in the 1.85 format.
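
If you end up doing this regularly, the two commands can be wrapped in a
small script.  This is only a sketch for a current Python 3.  The function
names conversion_commands and convert_history are made up for illustration,
and it assumes db_dump185 and db_load are both on your PATH and are invoked
the same way as in the shell session above.  It pipes the dump straight into
db_load instead of going through an intermediate text file.

```python
import subprocess


def conversion_commands(src, dst):
    """Return the argv lists for the dump step and the load step."""
    return ["db_dump185", src], ["db_load", dst]


def convert_history(src, dst):
    """Dump the 1.85 file src and load the result into a new file dst."""
    dump_cmd, load_cmd = conversion_commands(src, dst)
    # Equivalent to: db_dump185 src | db_load dst
    dump = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
    try:
        load_status = subprocess.call(load_cmd, stdin=dump.stdout)
    finally:
        dump.stdout.close()
        dump_status = dump.wait()
    if dump_status != 0 or load_status != 0:
        raise RuntimeError(
            "conversion failed (db_dump185 exit %d, db_load exit %d)"
            % (dump_status, load_status)
        )
```

You would call it as convert_history("history.dat", "history2.dat") and then
open history2.dat as shown above.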

-- 
Skip Montanaro (skip at mojam.com)
Support the Mojam.com Affiliates Program: http://www.mojam.com/affl/
(847)971-7098



