[Expat-discuss] Parser query
Kev Buckley
k.buckley@lancaster.ac.uk
Wed, 15 Aug 2001 15:55:18 +0100 (BST)
Hello,
I'm trying to understand what characters the expat parser is returning
when it reads the numeric character references for the &szlig; and &oslash;
characters.
Bit of background.
I'm dumping data off a Palm which produces
octal 337 (dec 223) for the &szlig;
octal 370 (dec 248) for the &oslash;
But if I have &#223; and &#248; in my XML doc, when expat parses the
doc I seem to get back two bytes for each, as follows:
octal 303 + octal 237 for the ß
octal 303 + octal 270 for the ø
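For what it's worth, those byte pairs look like the UTF-8 encoding of code points 223 and 248. UTF-8 is what expat hands back to its handlers by default. Here is a minimal Python sketch to check the arithmetic; the helper name utf8_octal is my own, not anything from expat:

```python
def utf8_octal(code_point):
    """Return the UTF-8 bytes of a code point as a list of octal strings."""
    return [format(b, "o") for b in chr(code_point).encode("utf-8")]

# Code points taken from the numeric references &#223; and &#248;.
for cp in (223, 248):
    print(cp, utf8_octal(cp))
# 223 ['303', '237']
# 248 ['303', '270']
```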
Other "extended shift" characters from the Palm seem to translate OK,
in that their &#nnn; codes get parsed as
octal 302 + octal NNN
where NNN is the octal form of the decimal nnn used in the numeric
character reference.
e.g., the trademark symbol
octal 231 (dec 153) -> &#153; which is parsed as
octal 302 + octal 231
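If the two-byte sequences are indeed UTF-8, one way to get back to the single bytes the Palm produced might be to decode them and re-encode with one byte per character. This is only a sketch: it assumes every character involved has a code point below 256 (so Latin-1 can represent it), and utf8_to_single_bytes is a made-up helper name:

```python
def utf8_to_single_bytes(utf8_bytes):
    """Decode UTF-8, then re-encode as one byte per character (Latin-1).

    Assumption: every code point is below 256; anything else raises
    UnicodeEncodeError.
    """
    return utf8_bytes.decode("utf-8").encode("latin-1")

# Byte pairs written with octal escapes.
for pair in (b"\302\231", b"\303\237", b"\303\270"):
    single = utf8_to_single_bytes(pair)
    print([format(b, "o") for b in pair], "->", format(single[0], "o"))
# ['302', '231'] -> 231
# ['303', '237'] -> 337
# ['303', '270'] -> 370
```

The octal values on the right match the bytes the Palm dump produces for the same three characters.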
If the above makes any sense to anyone, do you have any clues as
to what I have missed in trying to get these characters out of
(through) the expat parser?
Kevin
--
Regards,
----------------------------------------------------------------------
* Kevin M. Buckley e-mail: K.Buckley@lancaster.ac.uk *
* *
* Systems Administrator *
* Computer Centre *
* Lancaster University Voice: +44 (0) 1524 5 93718 *
* LANCASTER. LA1 4YW Fax : +44 (0) 1524 5 25113 *
* England. *
* *
* My PC runs Linux/GNU, you still computing the Bill Gate$' way ? *
----------------------------------------------------------------------