increasing the page size of a dbm store?

Peter Otten __peter__ at web.de
Wed Nov 27 06:50:54 EST 2019


Tim Chase wrote:

> Working with the dbm module (using it as a cache), I've gotten the
> following error at least twice now:
> 
>   HASH: Out of overflow pages.  Increase page size
>   Traceback (most recent call last):
>   [snip]
>   File ".py", line 83, in get_data
>     db[key] = data
>   _dbm.error: cannot add item to database
> 
> I've read over the py3 docs on dbm
> 
> https://docs.python.org/3/library/dbm.html
> 
> but don't see anything about either "page" or "size" contained
> therein.
> 
> There's nothing particularly complex as far as I can tell.  Nothing
> more than a straightforward
> 
>   import dbm
>   with dbm.open("cache", "c") as db:
>     for thing in source:
>       key = extract_key_as_bytes(thing)
>       if key in db:
>         data = db[key]
>       else:
>         data = long_process(thing)
>         db[key] = data
> 
> The keys can get a bit large (think roughly book-title length), but
> not huge. I have 11k records so it seems like it shouldn't be
> overwhelming, but this is the second batch where I've had to nuke the
> cache and start afresh.  Fortunately I've tooled the code so it can
> work incrementally and no more than a hundred or so requests have to
> be re-performed.
> 
> How does one increase the page-size in a dbm mapping?  Or are there
> limits that I should be aware of?
> 
> Thanks,
> 
> -tkc
> 
> PS: FWIW, this is Python 3.6 on FreeBSD in case that exposes any
> germane implementation details.

I found the message here

https://github.com/lattera/freebsd/blob/master/lib/libc/db/hash/hash_page.c#L695

but it's not immediately obvious how to increase the page size, and the 
readme

https://github.com/lattera/freebsd/tree/master/lib/libc/db/hash

only states

"""
"bugs" or idiosyncracies

If you have a lot of overflows, it is possible to run out of overflow
pages.  Currently, this will cause a message to be printed on stderr.
Eventually, this will be indicated by a return error code.
"""

which is what you learned the hard way.

Python has its own "dumb and slow but simple dbm clone", dbm.dumb -- maybe 
it's smart and fast enough for your purpose?
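
For illustration, here is a minimal sketch of the same cache loop on top of dbm.dumb. It is pure Python, so the libc hash code that printed the message is not involved. long_process() and the key extraction below are hypothetical stand-ins for the real ones, and cached_results() is just a name I made up for the example; I have not measured how it performs with 11k records.

```python
import dbm.dumb
import os
import tempfile


def long_process(thing):
    # Hypothetical stand-in for the expensive computation in the original post.
    return thing.upper().encode("utf-8")


def cached_results(path, source):
    """Return {key: data}, computing only the entries missing from the cache."""
    results = {}
    # dbm.dumb.open() accepts the same "c" flag as dbm.open():
    # open for reading and writing, creating the files if necessary.
    with dbm.dumb.open(path, "c") as db:
        for thing in source:
            # Assumption: the key is simply the UTF-8 encoded item.
            key = thing.encode("utf-8")
            if key in db:
                data = db[key]
            else:
                data = long_process(thing)
                db[key] = data
            results[key] = data
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "cache")
        print(cached_results(path, ["a fairly long book title", "another one"]))
```

Only the open() call differs from the original snippet, so switching back to another dbm backend later should be a one-line change. Note that dbm.dumb keeps its data in separate .dat and .dir files next to the given path.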


