unicode html
Duncan Booth
duncan.booth at invalid.invalid
Tue Jul 18 11:21:50 EDT 2006
Sybren Stuvel wrote:
> Duncan Booth enlightened us with:
>> Don't bother using named entities. If you encode your unicode as
>> ascii replacing all non-ascii characters with the xml entity
>> reference then your pages will display fine whatever encoding is
>> specified in the HTTP headers.
>
> Which means OP can't use Unicode/UTF-8 entity references, since that's
> not specified in the HTTP header.
>
That doesn't matter: character references are not affected by the network
encoding.
From http://www.w3.org/TR/html4/charset.html#h-5.3.1:
> 5.3.1 Numeric character references
>
> Numeric character references specify the code position of a character
> in the document character set.
Character references use the *document character set*, which is
independent of the character encoding used for network transmission. For
HTML the document character set is defined as ISO 10646, and (section 5.1)
"The character set defined in [ISO10646] is character-by-character
equivalent to Unicode ([UNICODE])".
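
As a minimal sketch of the approach, assuming Python 3 and a made-up sample
string: the "xmlcharrefreplace" error handler replaces every non-ASCII
character with a numeric character reference, so the resulting bytes are
pure ASCII and read the same under any ASCII-compatible encoding named in
the HTTP headers.

```python
import html

# Hypothetical sample text: e-acute (U+00E9) and the euro sign (U+20AC).
text = "caf\u00e9 \u20ac5"

# Encode to ASCII; anything outside ASCII becomes &#NNNN; where NNNN is
# the decimal Unicode code point of the character.
encoded = text.encode("ascii", "xmlcharrefreplace")
print(encoded)  # b'caf&#233; &#8364;5'

# The bytes decode to the same markup whatever the declared encoding is,
# as long as that encoding is ASCII-compatible.
as_latin1 = encoded.decode("latin-1")
as_utf8 = encoded.decode("utf-8")

# Resolving the references (roughly what a browser does) restores the text.
restored = html.unescape(as_utf8)
```

The references carry code points, not bytes, which is why the transport
encoding has no say in how they are interpreted.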