very large dictionary

bearophileHUGS at lycos.com
Fri Aug 1 06:25:00 EDT 2008


Simon Strobl:
> I had a file bigrams.py with a content like below:
> bigrams = {
> ", djy" : 75 ,
> ", djz" : 57 ,
> ", djzoom" : 165 ,
> ", dk" : 28893 ,
> ", dk.au" : 854 ,
> ", dk.b." : 3668 ,
> ...
> }
> In another file I said:
> from bigrams import bigrams

You are probably hitting a limit on module size here. You can try
changing your on-disk data format to a plain text file like this:
", djy" 75
", djz" 57
", djzoom" 165
...
Then in a module you can create an empty dict and fill it by reading
the lines of the data file:
somedict = {}
for line in somefile:
  part, n = line.rsplit(" ", 1)
  somedict[part.strip('"')] = int(n)
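
A slightly fuller sketch of that loop, in case it helps. The function name load_bigrams and the in-memory sample are only assumptions for illustration; with real data you would pass an open file instead. It removes just the outer pair of quotes, so a bigram that itself starts or ends with a quote character is kept intact:

```python
import io


def load_bigrams(lines):
    """Build a dict from lines shaped like: ", djy" 75"""
    counts = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last space only, so spaces inside the key survive.
        part, n = line.rsplit(" ", 1)
        # Drop only the outer pair of double quotes around the key.
        if len(part) >= 2 and part[0] == '"' and part[-1] == '"':
            part = part[1:-1]
        counts[part] = int(n)
    return counts


# Stand-in for the real data file (hypothetical sample).
sample = io.StringIO('", djy" 75\n", djz" 57\n", djzoom" 165\n')
bigrams = load_bigrams(sample)
```

With a real file this would be something like load_bigrams(open("bigrams.txt")), where the file name is again just a placeholder.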

Otherwise you may have to use something like a BigTable, a DB, etc.
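
If you do go the DB route, one possible sketch uses the sqlite3 module from the standard library. The table name, column names and the count_of helper below are assumptions made up for this example, and ":memory:" stands in for a real file path:

```python
import sqlite3

# ":memory:" keeps the example self-contained; use a file path for real data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bigrams (bigram TEXT PRIMARY KEY, n INTEGER)")

# Hypothetical sample rows; in practice these would come from the data file.
rows = [(", djy", 75), (", djz", 57), (", djzoom", 165)]
conn.executemany("INSERT INTO bigrams VALUES (?, ?)", rows)
conn.commit()


def count_of(conn, bigram):
    """Return the stored count for a bigram, or 0 if it is absent."""
    row = conn.execute(
        "SELECT n FROM bigrams WHERE bigram = ?", (bigram,)
    ).fetchone()
    return row[0] if row else 0
```

Lookups then go to disk instead of RAM, so the same script shape works whether the data is small or very large, at the cost of slower access than a dict.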


> If there is no other way to do it, I will have to learn how to use
> databases in Python. I would prefer to be able to use the same type of
> scripts with data of all sizes, though.

I understand. I don't know whether there are documented limits for
dicts in 64-bit Python.

Bye,
bearophile


