[shelve] What are the limitations? Entering too many data crashes it on my machine!

Richard Walkington richard at stockcontrol.net
Thu Dec 26 19:07:35 EST 2002


F. GEIGER wrote:

>I've written a file syncher (well, kind of), which uses walk() to walk dir
>trees.
>
>I wanted to look at other possibilities to solve the problem, so I thought
>of a dict, which holds file properties (i.e. timestamp and size), where the
>pathname is the key. This dict would be delivered by an object of class
>DirectoryInfo or the like. Having two of them (one for the source, one for
>the target), I could make predictions like: "12345 files will be copied,
>because they are newer on source", "678 files will be copied, because they
>are missing on target" etc.
>
>Of course, if you supply the ctor of DirectoryInfo with "D:\\" and this
>drive is a 30MB drive filled up to 3/4 of its size with files, a normal dict
>would require quite a lot of mem.
>
>So I decided to drop in shelve.
>
>But suddenly, after having added 19220 entries an error "(0, 'Error')" is
>reported, when executing the statement
>
>self._fileInfos[str(pn)] = FileInfoNode(pn)
>
>I catch the exception, synch the shelve and retry the operation. This time
>it succeeds (BTW, synching the shelve or not does not change anything here,
>but I hoped it'd prevent the script from the final crash - see below).
>
>The same error occurs a second time, after having added 25913 entries.
>
>Then, after having added 36006 entries, the script crashes, because suddenly
>the sync() method is no longer recognized by the dict:
>'''
>File "D:\Lab\Design_Patterns.Python\Structural_Patterns.GoF\Composite__Directory.py", line 89, in _fileInfosAdd_
>   self._fileInfos.sync()
>File "C:\Programme\Python21\lib\shelve.py", line 94, in sync
>   self.dict.sync()
>bsddb.error: (22, 'Invalid argument')
>'''
>
>If I do not synch, calling any other method causes this crash (e.g.
>print len(self._fileInfos.keys()) )
>
>The size of the file shelve stores the data in is about 1.00 GB
>(1 084 782 592 bytes).
>
>You might say, that my solution is not appropriate for this task, use a real
>db, or at least use walk(). But that's not the point. The point is, what did
>I do wrong with shelve?
>
>Is 1 GB an implicit limit here? If so, what are those two "non-errors"
>occurring much earlier?
>
>Can anybody help to resolve this?
>
>Many thanks in advance and best regards
>Franz GEIGER
>
>
>P.S.: ActivePython 2.1.3 on W2k, but ActivePython 2.2.1 on WinXP yields the
>same "results".
>
Using the little test program below with Python 2.2 on Windows XP, I get
the same problem. Installing BerkeleyDB 3.1 from
http://pybsddb.sourceforge.net/ and using the bsddb3.shelve module
instead of shelve seems to work fine.

Regards
Richard Walkington
C Base Systems

import shelve
import sys

# Add 10000 small records; stop and report at the first failure.
d = shelve.open("db.tst")
for i in xrange(10000):
    try:
        if not d.has_key(str(i)):
            d[str(i)] = "This is a big string"
    except Exception, e:
        print e
        print "Record ", i
        sys.exit(1)
d.close()
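
For reference, the approach described in the quoted message (one mapping per
tree from pathname to timestamp and size, then comparing the two to predict
what would be copied) might be sketched roughly as follows. This is only an
illustration in current Python 3, not the original poster's code: the names
snapshot and plan_copy are made up here, and the stores can be plain dicts or
open shelves, since only item assignment, lookup, and iteration are used.

```python
import os


def snapshot(root, store):
    """Record (mtime, size) for every file under root, keyed by relative path.

    store is any mutable mapping (a dict or an open shelve); hypothetical helper.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            store[os.path.relpath(full, root)] = (st.st_mtime, st.st_size)
    return store


def plan_copy(source, target):
    """Return (missing, newer): paths absent on target, and paths newer on source."""
    missing = sorted(k for k in source if k not in target)
    newer = sorted(
        k for k in source
        if k in target and source[k][0] > target[k][0]
    )
    return missing, newer
```

With a shelf, usage would look something like
snapshot("D:\\", shelve.open("source.db")), and the lengths of the two lists
returned by plan_copy give the "N files will be copied" figures. Storing small
tuples rather than whole objects per file also keeps the shelf file much
smaller.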