Python object overhead?
John Nagle
nagle at animats.com
Sat Mar 24 02:26:37 EDT 2007
Matt Garman wrote:
> I'm trying to use Python to work with large pipe ('|') delimited data
> files. The files range in size from 25 MB to 200 MB.
>
> Since each line corresponds to a record, what I'm trying to do is
> create an object from each record. However, it seems that doing this
> causes the memory overhead to go up two or three times.
Why do you want all the records in memory at once? Are you
doing some lookup on them, or what? If you're processing files
sequentially, don't keep them all in memory.
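
As a rough sketch of what sequential processing could look like (the three-field layout, the function names, and the choice of the csv module are assumptions for illustration, not anything from the original post), a generator can hand back one record at a time so only the current line is held in memory:

```python
import csv
import io


def iter_records(lines, delimiter="|"):
    """Yield one record (a list of string fields) at a time.

    `lines` is any iterable of text lines, e.g. an open file object.
    QUOTE_NONE treats quote characters as ordinary data, which is an
    assumption about the file format.
    """
    reader = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    for row in reader:
        if row:  # skip blank lines
            yield row


def total_of_column(lines, column):
    """Sum one numeric column without storing the records."""
    total = 0.0
    for fields in iter_records(lines):
        total += float(fields[column])
    return total


# Hypothetical sample data standing in for a real file.
sample = io.StringIO("a|1.5|x\nb|2.5|y\n")
print(total_of_column(sample, 1))
```

With a real file, the same function would be called on the object returned by open(path, newline=""), and memory use should stay roughly flat regardless of file size.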
You're getting into the size range where it may be time to
use a database.
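
If lookups are the goal, one possible shape for the database route is sketched below, assuming the standard-library sqlite3 module and a made-up three-column schema (key, name, amount); the table and function names are hypothetical:

```python
import sqlite3


def load_records(conn, lines):
    """Insert pipe-delimited lines into a table, one row per record."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records "
        "(key TEXT PRIMARY KEY, name TEXT, amount REAL)"
    )
    # Generator expression: rows are produced lazily, not built as a list.
    rows = (line.rstrip("\n").split("|") for line in lines if line.strip())
    conn.executemany("INSERT INTO records VALUES (?, ?, ?)", rows)
    conn.commit()


def lookup(conn, key):
    """Return (name, amount) for a key, or None if it is absent."""
    cur = conn.execute(
        "SELECT name, amount FROM records WHERE key = ?", (key,)
    )
    return cur.fetchone()


# In-memory database for the demo; a file path would persist to disk.
conn = sqlite3.connect(":memory:")
load_records(conn, ["k1|alpha|1.5\n", "k2|beta|2.5\n"])
print(lookup(conn, "k2"))
```

Passing a file name instead of ":memory:" keeps the records on disk, so the Python process holds only the rows it is currently looking at.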
John Nagle