[Python-Dev] Encoding of code in XML

David Ascher DavidA@ActiveState.com
Mon, 17 Apr 2000 15:51:21 -0700


> Hmm.  I think the way everybody does it is to use the language
> to get around the need for ever saying "]]>".  For example, in
> Python, if that was outside of a string, you could insert some
> spaces without changing the meaning, or if it was inside a string,
> you could add two strings together etc.

> You're right that this seems a bit ugly, but i think it could be
> even harder to get all the language communities to swallow
> something like "replace all occurrences of ]]> with some ugly
> escape string" -- since the above (hackish) method has the
> advantage that you can just run code directly copied from a piece
> of CDATA, and now you're asking them all to run the CDATA through
> some unescaping mechanism beforehand.

But it has the bad disadvantages that it's language-specific and modifies
code rather than encode it.  It has the even worse disadvantage that it
requires you to parse the code to encode/decode it, something much more
expensive than is really necessary!

> Although i'm less optimistic about the success of such a standard,
> i'd certainly be up for it, if we had a good answer to propose.

I'm thinking that if we had a good answer, we can probably get it into the
core libraries for a few good languages, and document it as 'the standard',
if we could get key people on board.

> Here is one possible answer

Right, that's the sort of thing I was looking for.

> def escape(text):
>     cdata = replace(text, "@@", "@@@")
>     cdata = replace(cdata, "]]>", "@@>")
>     return cdata
>
> def unescape(cdata):
>     text = replace(cdata, "@@>", "]]>")
>     text = replace(text, "@@@", "@@")
>     return text

(the above fails on @@>, but that's the general idea I had in mind).

--david

I know!:  "]]>" <==> "Microsoft engineers are puerile weenies!"