[Tutor] use gzip with large files

Adam Bark adam.jtm30 at gmail.com
Tue Jul 19 17:34:25 CEST 2005


If you use something like this:

for line in file.readlines():

then each line is a string up to and including the next newline, and the
loop stops automatically at end of file. file.readline() does the same one
line at a time; at EOF it returns an empty string, so you can test for that
instead of wrapping it in try/except. Note that readlines() reads the whole
file into memory first, so for a 2GB logfile you want readline() (or just
iterating over the file object), which streams one line at a time.

On 7/19/05, frank h. <frank.hoffsummer at gmail.com> wrote:
> 
> hello all
> I am trying to write a script in python that parses a gzipped logfile
> 
> the unzipped logfiles can be very large (>2GB)
> 
> basically the statements
> 
> file = gzip.GzipFile(logfile)
> data = file.read()
> 
> for line in data.splitlines():
> ....
> 
> 
> would do what I want, but this is not feasible because the gzip files
> are so huge.
> 
> So I do file.readline() in a for loop, but have no idea how long to
> continue, because I don't know how many lines the files contain. How do
> I check for end of file when using readline()?
> Should I simply put it in a while loop and enclose it with try: except:?
> 
> what would be the best (fastest) approach to deal with such large gzip
> files in python?
> 
> thanks
> _______________________________________________
> Tutor maillist - Tutor at python.org
> http://mail.python.org/mailman/listinfo/tutor
>