readlines() reading incorrect number of lines?

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Thu Dec 20 16:13:40 EST 2007


[Fixing top-posting.]

On Thu, 20 Dec 2007 12:41:44 -0800, Wojciech Gryc wrote:

> On Dec 20, 3:30 pm, John Machin <sjmac... at lexicon.net> wrote:
[snip]
>> > However, when I use Python's various methods -- readline(),
>> > readlines(), or xreadlines() and loop through the lines of the file,
>> > the program exits at 16,000 lines. No error output or anything
>> > -- it seems the end of the loop was reached, and the code was
>> > executed successfully.
...
>> One possibility: you are running this on Windows and the file contains
>> Ctrl-Z aka chr(26) aka '\x1a'.
> 
> Hi,
> 
> Python 2.5, on Windows XP. Actually, I think you may be right about \x1a
> -- there's a few lines that definitely have some strange character
> sequences, so this would make sense... Would you happen to know how I
> can actually fix this (e.g. replace the character)? Since Python doesn't
> see the rest of the file, I don't even know how to get to it to fix the
> problem... Due to the nature of the data I'm working with, manual
> editing is also not an option.
> 
> Thanks,
> Wojciech
> 


Open the file in binary mode:

open(filename, 'rb')


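and Windows should do no special handling of Ctrl-Z characters. (In text
mode, the Windows C runtime treats '\x1a' as end-of-file, which is why
the loop stops early without raising an error.)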
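
To actually replace the character, as asked, something along these lines
might do. It is only a rough sketch: strip_ctrl_z, the paths and the 64 KB
chunk size are names and values I made up for illustration, and it is
written for a current Python. On 2.5 you would have to drop the b prefixes
and either add "from __future__ import with_statement" or use try/finally.
It copies the file in binary mode and removes (or replaces) every '\x1a'
byte on the way through.

```python
import os
import tempfile

CTRL_Z = b'\x1a'  # the byte that Windows text mode treats as end-of-file


def strip_ctrl_z(src_path, dst_path, replacement=b''):
    """Copy src_path to dst_path, replacing every Ctrl-Z byte.

    Returns the number of Ctrl-Z bytes found. Both files are opened in
    binary mode, so nothing in the data is given special treatment.
    """
    found = 0
    with open(src_path, 'rb') as src:
        with open(dst_path, 'wb') as dst:
            while True:
                chunk = src.read(64 * 1024)  # arbitrary chunk size
                if not chunk:
                    break
                found += chunk.count(CTRL_Z)
                # Replacing a single byte is safe across chunk boundaries.
                dst.write(chunk.replace(CTRL_Z, replacement))
    return found


if __name__ == '__main__':
    # Small self-contained demonstration using throwaway files.
    fd, dirty = tempfile.mkstemp()
    os.close(fd)
    clean = dirty + '.clean'
    with open(dirty, 'wb') as f:
        f.write(b'line 1\r\nline \x1a2\r\nline 3\r\n')
    print(strip_ctrl_z(dirty, clean))
    os.remove(dirty)
    os.remove(clean)
```

Note that if you read lines straight from a file opened with 'rb' on
Windows, the line endings are not translated, so each line ends in
'\r\n' rather than '\n'. Stripping the trailing whitespace from each line
takes care of that.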



-- 
Steven


