Eclipse/PyDev - BOM Lexical Error

Lawrence D'Oliveiro ldo at geek-central.gen.new_zealand
Mon Oct 11 03:04:52 EDT 2010


In message <mailman.1533.1286774527.29448.python-list at python.org>, Ethan 
Furman wrote:

> Lawrence D'Oliveiro wrote:
>
>> In message <mailman.1466.1286556950.29448.python-list at python.org>, Ethan
>> Furman wrote:
>> 
>>>Lawrence D'Oliveiro wrote:
>>>
>>>>But they can only recognize it as a BOM if they assume UTF-8 encoding to
>>>>begin with. Otherwise it could be interpreted as some other coding.
>>>
>>>Not so.  The first three bytes are the flag.
>> 
>> But this is just a text file. All parts of its contents are text, there
>> is no “flag”.
>> 
>> If you think otherwise, then tell us what are these three “flag” bytes
>> for a Windows-1252-encoded text file?
> 
> MS treats those first three bytes as a flag -- if they equal the BOM, MS
> treats it as UTF-8, if they equal anything else, MS does not treat it as
> UTF-8.

So what does it treat the file as in that case? You previously gave
examples of flag values for dBase III. What are the flag values for
Windows-1252 versus, say, ISO-8859-15?
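
To make the ambiguity concrete, here is a minimal Python 3 sketch. The helper
name sniff is hypothetical, and this is not a claim about how any Microsoft
tool is actually implemented. It tests for the UTF-8 BOM the way a "flag"
check would, then shows that the same three bytes also decode without error
as Windows-1252 and as ISO-8859-15:

```python
import codecs

BOM = codecs.BOM_UTF8  # the three bytes b'\xef\xbb\xbf'


def sniff(data):
    """Hypothetical helper: guess 'utf-8-sig' if data starts with the
    UTF-8 BOM, otherwise return None (no guess at all)."""
    return "utf-8-sig" if data.startswith(BOM) else None


# A UTF-8 file written with a leading BOM.
sample = BOM + "caf\u00e9".encode("utf-8")
print(sniff(sample))               # 'utf-8-sig'
print(sample.decode("utf-8-sig"))  # BOM is stripped on decode

# The same three bytes are also valid text in single-byte encodings,
# so their presence alone does not prove the file is UTF-8.
for enc in ("cp1252", "iso8859_15"):
    print(enc, repr(BOM.decode(enc)))

# Without the BOM the check says nothing about which encoding is in use.
print(sniff("caf\u00e9".encode("cp1252")))  # None
```

When the BOM is absent, the check returns None for Windows-1252 and
ISO-8859-15 input alike, so it cannot tell the two apart.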


