Unicode BOM marks
Steve Horsley
shoot at the.moon
Sun Mar 13 19:19:07 EST 2005
Martin v. Löwis wrote:
> Steve Horsley wrote:
>
>> It is my understanding that the BOM (U+FEFF) is actually the Unicode
>> character "ZERO WIDTH NO-BREAK SPACE".
>
>
> My understanding is that this used to be the case. According to
>
> http://www.unicode.org/faq/utf_bom.html#38
>
> the application should now specify how it is processed; simply dropping
> it and reporting an error are both acceptable behaviour.
> Applications that need the ZWNBSP behaviour (i.e. want to indicate that
> there should be no break at this point) should use U+2060 (WORD JOINER).
>
> Regards,
> Martin
I'm out of date, then. Thanks for the link.
Steve
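
For illustration, here is a minimal sketch of the two behaviours discussed above (dropping a leading U+FEFF, or reporting it as an error). It assumes a current Python 3 interpreter. The helper names drop_leading_bom and reject_leading_bom are made up for this example and are not part of any library. The "utf-8-sig" codec is the standard library's way of stripping a UTF-8 BOM while decoding.

```python
import codecs

BOM = "\ufeff"          # U+FEFF: byte order mark (formerly also ZWNBSP)
WORD_JOINER = "\u2060"  # U+2060: the character to use for "no break here"


def drop_leading_bom(text):
    """Hypothetical helper: silently drop one leading U+FEFF, if present."""
    return text[1:] if text.startswith(BOM) else text


def reject_leading_bom(text):
    """Hypothetical helper: treat a leading U+FEFF as an error instead."""
    if text.startswith(BOM):
        raise ValueError("unexpected U+FEFF at start of text")
    return text


# Sample data: a UTF-8 BOM followed by text that uses WORD JOINER.
raw = codecs.BOM_UTF8 + ("no" + WORD_JOINER + "break").encode("utf-8")

decoded_plain = raw.decode("utf-8")    # plain utf-8 keeps U+FEFF as a character
decoded_sig = raw.decode("utf-8-sig")  # utf-8-sig strips a leading BOM
```

Which of the two helpers is appropriate depends on what the application or protocol specifies. Either way, U+2060 inside the text is left untouched.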