Eclipse/PyDev - BOM Lexical Error

Ethan Furman ethan at stoneleaf.us
Sun Oct 10 22:35:27 EDT 2010


Lawrence D'Oliveiro wrote:
> In message <mailman.1466.1286556950.29448.python-list at python.org>, Ethan 
> Furman wrote:
> 
> 
>>Lawrence D'Oliveiro wrote:
>>
>>
>>>But they can only recognize it as a BOM if they assume UTF-8 encoding to
>>>begin with. Otherwise it could be interpreted as some other coding.
>>
>>Not so.  The first three bytes are the flag.
> 
> 
> But this is just a text file. All parts of its contents are text, there is 
> no “flag”.
> 
> If you think otherwise, then tell us what are these three “flag” bytes for a 
> Windows-1252-encoded text file?

MS treats those first three bytes as a flag: if they equal the UTF-8 BOM 
(EF BB BF), MS treats the file as UTF-8; if they equal anything else, MS 
does not treat it as UTF-8.

If you think otherwise, hop on an MS machine and test it out.
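
For anyone without an MS machine handy, here is a rough Python sketch of the 
heuristic as I understand it. It is not Microsoft's actual code; the function 
name sniff_and_decode and the Windows-1252 fallback are my own assumptions, 
chosen only for illustration.

```python
import codecs


def sniff_and_decode(data, fallback="cp1252"):
    """Decode bytes using the 'first three bytes are a flag' heuristic.

    Hypothetical helper: returns (encoding_used, text).
    The cp1252 fallback is an assumption, not a documented MS default.
    """
    # codecs.BOM_UTF8 is the byte string b'\xef\xbb\xbf'.
    if data[:3] == codecs.BOM_UTF8:
        # Flag present: strip it and decode the rest as UTF-8.
        return "utf-8", data[3:].decode("utf-8")
    # Flag absent: treat the bytes as the legacy code page.
    # errors="replace" because a few byte values are undefined in cp1252.
    return fallback, data.decode(fallback, errors="replace")


with_bom = codecs.BOM_UTF8 + "héllo".encode("utf-8")
without_bom = "héllo".encode("cp1252")

print(sniff_and_decode(with_bom))
print(sniff_and_decode(without_bom))

# Lawrence's point: the same three bytes are also valid Windows-1252 text.
print(codecs.BOM_UTF8.decode("cp1252"))
```

The last line shows the ambiguity being argued about: read as Windows-1252, 
the three bytes are ordinary characters, so the check is a convention that 
the reader applies, not something the file format guarantees.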

~Ethan~


