iterating over a file with two pointers

Oscar Benjamin oscar.j.benjamin at gmail.com
Wed Sep 18 09:09:17 EDT 2013


On 18 September 2013 13:56, Roy Smith <roy at panix.com> wrote:
>
>> > On Wed, Sep 18, 2013 at 9:12 PM, nikhil Pandey <nikhilpandey90 at gmail.com>
>> > wrote:
>> >> hi,
>> >> I want to iterate over the lines of a file and when i find certain lines,
>> >> i need another loop starting from the next of that "CERTAIN" line till a
>> >> few (say 20) lines later.
>> >> so, basically i need two pointers to lines (one for outer loop(for each
>> >> line in file)) and one for inner loop. How can i do that in python?
>> >> please help. I am stuck up on this.
>> [...]
>
> In article <mailman.115.1379504419.18130.python-list at python.org>,
>  Dave Angel <davea at davea.name> wrote:
> [I hope I unwound the multi-layer quoting right]
>> In addition, is this really a text file?  For binary files, you could
>> use seek(), and manage things yourself.  But that's not strictly legal
>> in a text file, and may work on one system, not on another.
>
> Why is seek() not legal on a text file?  The only issue I'm aware of is
> the note at http://docs.python.org/2/library/stdtypes.html, which says:
>
> "On Windows, tell() can return illegal values (after an fgets()) when
> reading files with Unix-style line-endings. Use binary mode ('rb') to
> circumvent this problem."
>
> so, don't do that (i.e. read unix-line-terminated files on windows).
> But assuming you're not in that situation, it seems like something like
> this should work:
>
>> I'd suggest you open the file twice, and get two file objects.  Then you
>> can iterate over them independently.

There's no need to use extra OS resources by opening the file twice, or
to screw up the IO caching with seek(). Peter's version holds only as
many lines as necessary in an internal Python buffer and performs the
minimum possible amount of IO. I would expect it to be more efficient
as well as less error-prone on Windows.
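
Peter's code is not quoted in this message, so what follows is only a
rough sketch of the general idea and not his version. It assumes a
single pass over the file, with a small deque holding the lines that
have been read ahead but not yet visited by the outer loop. The names
lines_with_lookahead and is_trigger are made up for illustration.

```python
from collections import deque
from itertools import islice


def lines_with_lookahead(lines, is_trigger, n=20):
    """Yield (line, following) for every line where is_trigger(line) is true.

    `following` is a list of up to `n` lines that come after `line`.
    Hypothetical helper: it reads each line from `lines` exactly once and
    never buffers more than `n` lines at a time.
    """
    it = iter(lines)
    buf = deque()  # lines already read ahead, still to be seen by the outer loop
    while True:
        if buf:
            line = buf.popleft()
        else:
            try:
                line = next(it)
            except StopIteration:
                return
        if is_trigger(line):
            # Top the buffer up to n lines; islice stops early at end of input.
            buf.extend(islice(it, n - len(buf)))
            yield line, list(buf)


if __name__ == "__main__":
    import io

    # io.StringIO stands in for a real file opened with open(path).
    f = io.StringIO("a\nMARK\nb\nc\nMARK\nd\n")
    for line, following in lines_with_lookahead(f, lambda s: s.startswith("MARK"), n=2):
        print(repr(line), following)
```

Because the buffered lines are handed back to the outer loop afterwards,
a "CERTAIN" line that falls inside another one's window is still seen,
and no seek() or second file object is involved.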


Oscar


