Python object overhead?

John Nagle nagle at animats.com
Sat Mar 24 02:26:37 EDT 2007


Matt Garman wrote:
> I'm trying to use Python to work with large pipe ('|') delimited data
> files.  The files range in size from 25 MB to 200 MB.
> 
> Since each line corresponds to a record, what I'm trying to do is
> create an object from each record.  However, it seems that doing this
> causes the memory overhead to go up two or three times.

    Why do you want all the records in memory at once?  Are you
doing some lookup on them, or what?  If you're processing the files
sequentially, handle one record at a time and don't keep all the
records in memory.
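
    A rough sketch of what sequential processing could look like,
assuming the records have no quoted fields and that the first field is
the one of interest. The names iter_records and count_by_first_field
are made up for illustration, and the in-memory sample stands in for a
real file object opened with open():

```python
import csv
import io

def iter_records(lines, delimiter="|"):
    """Yield one record (a list of fields) at a time from an iterable of lines."""
    # Assumption: fields are never quoted, so quote characters are plain data.
    reader = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    for fields in reader:
        if fields:  # csv.reader yields [] for blank lines; skip those
            yield fields

def count_by_first_field(lines):
    """Example of per-record work: tally records by their first field."""
    counts = {}
    for fields in iter_records(lines):
        # Only the running tally is kept; each record is discarded after use.
        counts[fields[0]] = counts.get(fields[0], 0) + 1
    return counts

# Stand-in for a real file; an open file object can be passed the same way.
sample = io.StringIO("a|1|x\nb|2|y\na|3|z\n")
result = count_by_first_field(sample)
```

    Because iter_records is a generator, only one record is alive at a
time, so memory use should stay roughly flat regardless of file size.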

    You're getting into the size range where it may be time to
use a database.
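
    If lookups are what you need, one possible approach is the sqlite3
module from the standard library (Python 2.5 and later). This is only a
sketch: the two-column table, the column names, and the load_records
helper are assumptions for illustration, and every line is assumed to
contain at least one '|' delimiter.

```python
import sqlite3

def load_records(conn, lines, delimiter="|"):
    """Load 'key|value' lines into a table and index the key column."""
    conn.execute("CREATE TABLE IF NOT EXISTS records (key TEXT, value TEXT)")
    # Generator expression: rows are fed to SQLite one at a time,
    # so the whole file never has to sit in a Python list.
    rows = (
        line.rstrip("\n").split(delimiter, 1)
        for line in lines
        if line.strip()  # skip blank lines
    )
    conn.executemany("INSERT INTO records (key, value) VALUES (?, ?)", rows)
    # The index makes lookups by key fast without holding records in memory.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_key ON records (key)")
    conn.commit()

# ":memory:" keeps the demo self-contained; a filename would store it on disk.
conn = sqlite3.connect(":memory:")
load_records(conn, ["alice|42\n", "bob|7\n", "alice|13\n"])

query = "SELECT value FROM records WHERE key = ? ORDER BY rowid"
values = [value for (value,) in conn.execute(query, ("alice",))]
```

    With a file-backed database the data lives on disk and only the
rows a query returns are turned into Python objects.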

				John Nagle


