How to change a generator ? - resolved

Barak, Ron Ron.Barak at lsi.com
Thu Dec 25 02:27:23 EST 2008


Hi Gabriel,

Your remarks fixed my problem. My code now looks as shown below and behaves as expected.

Thanks Gabriel.

Merry Christmas and Happy Hanukkah,
Ron.


$ cat generator.py
#!/usr/bin/env python

import gzip
from Debug import _line as line

class LogStream():

    def __init__(self, filename):
        self.filename = filename
        self.input_file = self.open_file(filename)

    def open_file(self, in_file):
        try:
            f = gzip.GzipFile(in_file, "r")
            f.readline()
        except IOError:
            f = open(in_file, "r")
            f.readline()
        f.seek(0)
        return f

    def line_generator(self):
        print line()+". self.input_file.tell()==",self.input_file.tell()
        while True:
            line_ = self.input_file.readline()
            print line()+". self.input_file.tell()==",self.input_file.tell()
            if not line_:
                break
            yield line_.strip()


if __name__ == "__main__":

    filename = "sac.log.50lines"
    log_stream = LogStream(filename)
    log_stream.input_file.seek(0)
    line_generator = log_stream.line_generator()
    line_ = line_generator.next()

$ python generator.py
23. self.input_file.tell()== 0
26. self.input_file.tell()== 247

$ !wc
wc -c sac.log.50lines
6623 sac.log.50lines

$
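
For readers who do not have the Debug module, the same check can be reproduced with a self-contained sketch. This is not the listing above: the names line_positions and make_sample and the sample data are made up for illustration, and the file is opened in binary mode on the assumption that plain byte offsets are wanted from tell().

```python
import os
import tempfile


def line_positions(path):
    """Yield (stripped_line, position_after_line) pairs using readline()."""
    with open(path, "rb") as f:
        while True:
            raw = f.readline()
            if not raw:
                break
            # After readline(), tell() points at the start of the next line.
            yield raw.strip(), f.tell()


def make_sample(lines):
    """Write the given byte strings to a temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.writelines(lines)
    return path


if __name__ == "__main__":
    sample = make_sample([b"first\n", b"second line\n", b"third\n"])
    try:
        for text, pos in line_positions(sample):
            print((text, pos))
    finally:
        os.remove(sample)
```

Each reported position should advance by the length of the line just read, rather than jumping to the size of the file.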

-----Original Message-----
From: MRAB [mailto:google at mrabarnett.plus.com]
Sent: Wednesday, December 24, 2008 20:00
To: python-list at python.org
Subject: Re: How to change a generator ?

Gabriel Genellina wrote:
> On Wed, 24 Dec 2008 15:03:58 -0200, MRAB <google at mrabarnett.plus.com>
> wrote:
>
>>> I have a generator whose aim is to return consecutive lines from a
>>> file (the listing below is a simplified version).
>>> However, as it is written now, the generator method moves the text
>>> file pointer to the end of the file after the first invocation.
>>> Namely, the file pointer changes from 0 to 6623 on line 24.
>>>
>> It might be that the generator method of self.input_file is reading
>> the file a chunk at a time for efficiency even though it's yielding a
>> line at a time.
>
> I think this is the case too.
> I can think of 3 alternatives:
>
> a) open the file unbuffered (bufsize=0). But I think this would
> greatly decrease performance.
>
> b) keep track internally of file position (by adding each line length).
> The file should be opened in binary mode in this case (to avoid any '\n'
> translation).
>
> c) return line numbers only, instead of file positions. Seeking to a
> certain line number requires re-reading the whole file from the start;
> depending on how often this is required, and how big the file is, this
> might be acceptable.
>
readline() appears to work as expected, leaving the file position at the start of the next line.
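
For completeness, alternatives (b) and (c) quoted above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread: the function names lines_with_offsets, read_line_at and read_line_number are hypothetical, the input is assumed to be a plain (not gzipped) file, and line numbers are taken to be 0-based.

```python
import itertools
import os
import tempfile


def lines_with_offsets(path):
    """Alternative (b): yield (start_offset, raw_line) without calling tell()."""
    offset = 0
    # Binary mode: no newline translation, so len(raw) is the real byte count.
    with open(path, "rb") as f:
        for raw in f:
            yield offset, raw
            offset += len(raw)


def read_line_at(path, offset):
    """Seek to an offset recorded by lines_with_offsets() and read one line."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.readline()


def read_line_number(path, number):
    """Alternative (c): re-read from the start up to a 0-based line number.

    Returns None if the file has fewer lines than requested.
    """
    with open(path, "rb") as f:
        return next(itertools.islice(f, number, number + 1), None)


if __name__ == "__main__":
    # Hypothetical sample data, written to a temporary file for the demo.
    fd, sample = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as out:
        out.write(b"alpha\r\nbeta\ngamma")
    try:
        for start, raw in lines_with_offsets(sample):
            print((start, raw))
        print(read_line_at(sample, 7))
        print(read_line_number(sample, 2))
    finally:
        os.remove(sample)
```

With (b) the recorded offsets stay valid even if the file object reads ahead in chunks, because tell() is never consulted. With (c) every lookup costs a scan from the beginning of the file.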
