[Tutor] use gzip with large files
Adam Bark
adam.jtm30 at gmail.com
Tue Jul 19 17:34:25 CEST 2005
If you iterate over the file object itself, like this:
for line in file:
then each line is a string up to and including the next newline, and the
loop stops automatically at EOF. file.readline() likewise gives you one
line at a time, and it returns an empty string once the end of the file
is reached, so you can test for that rather than catching an exception.
(Avoid file.readlines() here -- it reads the whole file into memory.)
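Putting that together, here is a minimal sketch of both approaches. The file name and contents are made up for illustration (the sketch writes its own small gzip file so it can run on its own); with a real logfile you would just open it directly.

```python
import gzip
import os
import tempfile

# Write a small sample gzip file so the sketch is self-contained.
# ("sample.log.gz" is a made-up name, not from the original post.)
path = os.path.join(tempfile.mkdtemp(), "sample.log.gz")
with gzip.open(path, "wt") as f:
    f.write("first line\nsecond line\nthird line\n")

# Memory-friendly approach: iterate the file object itself, which
# yields one line at a time instead of reading the whole file.
lines = []
with gzip.open(path, "rt") as f:
    for line in f:
        lines.append(line.rstrip("\n"))

# Explicit readline() loop: readline() returns "" (an empty string)
# at end of file, so test for that -- no try/except needed.
count = 0
with gzip.open(path, "rt") as f:
    while True:
        line = f.readline()
        if not line:  # empty string signals EOF
            break
        count += 1
```

Either way the decompression happens incrementally, so memory use stays small no matter how large the uncompressed data is.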
On 7/19/05, frank h. <frank.hoffsummer at gmail.com> wrote:
>
> hello all
> I am trying to write a script in python that parses a gzipped logfile
>
> the unzipped logfiles can be very large (>2GB)
>
> basically the statements
>
> file = gzip.GzipFile(logfile)
> data = file.read()
>
> for line in data.splitlines():
> ....
>
>
> would do what I want, but this is not feasible because the gzip files
> are so huge.
>
> So I do file.readline() in a for loop, but have no idea how long to
> continue, because I don't know how many lines the files contain. How do
> I check for end of file when using readline()?
> Should I simply put it in a while loop and enclose it with try: except:?
>
> what would be the best (fastest) approach to deal with such large gzip
> files in python?
>
> thanks
> _______________________________________________
> Tutor maillist - Tutor at python.org
> http://mail.python.org/mailman/listinfo/tutor
>