[Tutor] reading very large files

Jerry Hill malaclypse2 at gmail.com
Tue May 17 19:54:53 CEST 2011


On Tue, May 17, 2011 at 1:20 PM, Vikram K <kpguy1975 at gmail.com> wrote:
> I wish to read a large data file (file size is around 1.8 MB) and manipulate
> the data in this file. Just reading and writing the first 500 lines of this
> file is causing a problem. I wrote:
...
>
> Traceback (most recent call last):
>   File
> "H:\genome_4_omics_study\GS000003696-DID\GS00471-DNA_B01_1101_37-ASM\GS00471-DNA_B01\ASM\gene-GS00471-DNA_B01_1101_37-ASM.tsv\test.py",
> line 3, in <module>
>     for i in fin.readlines():
> MemoryError
>
> -------
> is there a way to stop Python from slurping all the file contents at once?

Well, readlines() does read in the whole file at once, splitting it
into a list with one item per line.  I'm surprised that a 1.8 MB file
is causing you to hit a MemoryError, though -- that isn't a very big
file.  If you really just want to process one line at a time, use
"for i in fin:", which reads a single line into memory on each
iteration.  There's no need to read the whole thing at once.
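For example, here's a minimal sketch of writing just the first 500
lines to a new file without loading everything (the filenames are
placeholders -- substitute your own paths):

    import itertools

    # Open input and output; only one line is held in memory at a time.
    with open('gene-data.tsv') as fin, open('first500.tsv', 'w') as fout:
        # islice stops after pulling at most 500 lines from the iterator.
        for line in itertools.islice(fin, 500):
            fout.write(line)

itertools.islice works on any iterator, and a file object iterates
lazily line by line, so this never builds the big list that
readlines() does.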

-- 
Jerry
