[Expat-discuss] Handling arbitrary bytes in CDATA marked sections...
David Crowley
dcrowley@scitegic.com
Thu Dec 6 09:46:10 2001
At 11:34 PM 12/5/2001, Andrew.Nesbit@CSIRO.AU wrote:
>Hi ho, if somebody could help me with this problem, I'd really be
>appreciative :-)
>
>Basically, what I want to do is parse a document which includes CDATA
>marked sections. The thing is that I want the CDATA marked sections to be
>able to contain arbitrary 8-bit bytes (i.e. binary data). I do realise
>that this makes the document a non-XML document, but I do not want to have
>to use any encoding system on it. I need to read these bytes raw (i.e.
>not cook them into UTF-8 or anything), so they can be stored in an array
>of unsigned chars or shorts or something.
>
>Can somebody please give me some hints on how I can do this?
This should be a FAQ. Do yourself a favor and base64 encode the binary
data. It's really not hard, it's not slow, it keeps the document valid XML,
it makes the representation only about 33% larger (4 output characters for
every 3 input bytes), and you don't have to go making ugly hacks in the code
that nobody else is interested in. If you're using XML then USE XML. Don't
bastardize it.
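For what it's worth, here is a minimal sketch of the base64 approach in
plain C. The names b64_encode and b64_decode are hypothetical helpers made
up for this example; they are not part of Expat. The sketch assumes the
standard base64 alphabet with '=' padding, and the decoder skips whitespace
so the encoded text can be line-wrapped inside the element.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Standard base64 alphabet; index = 6-bit value. */
static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode n raw bytes as a NUL-terminated base64 string.
   Returns malloc'd memory (caller frees), or NULL on allocation failure. */
char *b64_encode(const unsigned char *in, size_t n)
{
    size_t out_len = 4 * ((n + 2) / 3);
    char *out = malloc(out_len + 1);
    size_t i, j = 0;

    if (!out)
        return NULL;
    for (i = 0; i < n; i += 3) {
        /* Pack up to three bytes into a 24-bit group. */
        unsigned long v = (unsigned long)in[i] << 16;
        if (i + 1 < n) v |= (unsigned long)in[i + 1] << 8;
        if (i + 2 < n) v |= in[i + 2];

        out[j++] = B64[(v >> 18) & 63];
        out[j++] = B64[(v >> 12) & 63];
        out[j++] = (i + 1 < n) ? B64[(v >> 6) & 63] : '=';
        out[j++] = (i + 2 < n) ? B64[v & 63] : '=';
    }
    out[j] = '\0';
    return out;
}

/* Decode n characters of base64 text into out, which must have room for
   at least 3 * n / 4 bytes. Whitespace is skipped and decoding stops at
   the first '='. Returns the number of bytes written, or -1 if a
   character outside the alphabet is found. */
long b64_decode(const char *in, size_t n, unsigned char *out)
{
    unsigned long acc = 0;
    int bits = 0;
    long written = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        const char *p;
        char c = in[i];

        if (c == '=')
            break;
        if (c == ' ' || c == '\n' || c == '\r' || c == '\t')
            continue;
        p = (c != '\0') ? strchr(B64, c) : NULL;
        if (!p)
            return -1;

        acc = (acc << 6) | (unsigned long)(p - B64);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out[written++] = (unsigned char)((acc >> bits) & 0xFF);
            acc &= (1UL << bits) - 1;  /* keep only the leftover bits */
        }
    }
    return written;
}
```

On the writing side you would put the output of b64_encode in the element
instead of the raw bytes. On the reading side, keep in mind that Expat may
deliver an element's character data in several calls to the handler set
with XML_SetCharacterDataHandler, so append the pieces to a buffer and call
b64_decode once the end-element handler fires.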
>I am prepared to do some hacking on the source to get this effect.
Please please please don't do that.
>Thank you!
>-Andrew Nesbit