[XML-SIG] (Py)DOM: Character References
Lars Marius Garshol
larsga@ifi.uio.no
19 Mar 1999 09:15:30 +0100
* Carsten Oberscheid
|
| Ok, since charrefs encode only characters from the document's base
| character set (Unicode for XML, ASCII for SGML -- is that right?)
No. XML uses Unicode, but SGML is not tied to ASCII: XML is SGML (an
SGML application profile, to be precise), so if SGML were limited to
ASCII, XML could not use Unicode. SGML as a meta-language has no fixed
document character set; the SGML declaration lets you define your own
character set in terms of well-known character sets.
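To illustrate the XML side, here is a minimal sketch. Python's standard
xml.dom.minidom is my choice for the example, and the sample document is
made up. It shows a document encoded in ISO 8859-1 that still refers to
a character outside that encoding, because a character reference names
a Unicode code point rather than a position in the encoding.

```python
from xml.dom import minidom

# Hypothetical sample document: encoded as ISO 8859-1, which has no
# euro sign, yet the character references name Unicode code points.
doc_bytes = (
    b'<?xml version="1.0" encoding="iso-8859-1"?>'
    b'<p>&#8364; &#xE5;</p>'
)

dom = minidom.parseString(doc_bytes)

# The parser has already resolved the references into characters.
text = dom.documentElement.firstChild.data

# List the code point of each character in the resolved text.
code_points = [hex(ord(ch)) for ch in text]
print(code_points)  # ['0x20ac', '0x20', '0xe5']
```

The decimal reference 8364 and the hexadecimal reference E5 both come
out as the Unicode characters U+20AC and U+00E5, whatever encoding the
file itself was stored in.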
So SGML can use Unicode/ISO 10646, as HTML 4.0 does for example[1],
but it can also use any other character set that consists of
well-known characters. It also has standard ways of handling
characters that are not in the declared character set. I don't think
it can handle every character encoding, though I might be wrong.
--Lars M.
[1] <URL:http://www.w3.org/TR/REC-html40/sgml/sgmldecl.html>