[Tutor] Logical error?

Dave Angel davea at davea.name
Sun May 4 14:48:59 CEST 2014


Danny Yoo <dyoo at hashcollision.org> Wrote in message:
> 
>> Hopefully, this makes the point clearer: we must not try to decode
>> individual lines.  By that time, the damage has been done: the act of
>> trying to break the file into lines by looking naively at newline byte
>> characters is invalid when certain characters can themselves have
>> newline characters.
> 
> Confusing last sentence.  Let me try that again.  Breaking the file
> into lines by looking naively for newline bytes is invalid because,
> under some encodings, the encoded form of certain characters itself
> contains a newline byte.  We've got to open the file with the right
> encoding in play.
> 
> 


Until you decode it, an encoded file is just binary data, and you
should never use readline or an equivalent on a binary file. Some
encodings go out of their way to make it seem to work, but taking
advantage of such details leaves you at risk when a new file with a
different encoding comes along.
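
Here is a rough sketch of the failure, not anything from the original
poster's code. The sample text and the choice of UTF-16-LE are my own
assumptions, picked because U+010A encodes there as the bytes 0A 01,
so its first byte looks exactly like a newline.

```python
import io

# Hypothetical sample data: U+010A encodes as the bytes 0A 01 in UTF-16-LE.
text = "\u010a one\ntwo\n"
raw = text.encode("utf-16-le")

# Naive approach: treat the data as binary and split on the newline byte.
naive_lines = io.BytesIO(raw).readlines()

# Safer approach: decode with the right encoding first, then split into lines.
decoded_lines = io.TextIOWrapper(
    io.BytesIO(raw), encoding="utf-16-le"
).readlines()

print(len(naive_lines))    # more pieces than there are real lines
print(decoded_lines)       # the two lines we actually wrote
```

The binary split cuts U+010A in half and also separates each real
newline from its trailing zero byte, so the pieces no longer decode
cleanly. With a real file, open(name, encoding=...) gives you the same
decode-first behavior.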


-- 
DaveA


