[Python-Dev] Encoding of code in XML

Sjoerd Mullender sjoerd@oratrix.nl
Wed, 19 Apr 2000 11:51:53 +0200


What is wrong with encoding ]]> in the XML way by using an extra
CDATA.  In other words split up the CDATA section into two in the
middle of the ]]> sequence:

import string
def encode_cdata(str):
    return '<![CDATA[' + \
	   string.join(string.split(str, ']]>'), ']]]]><![CDATA[>')) + \
	   ']]>'

On Mon, Apr 17 2000 "David Ascher" wrote:

> Lots of projects embed scripting & other code in XML, typically as CDATA
> elements.  For example, XBL in Mozilla.  As far as I know, no one ever
> bothers to define how one should _encode_ code in a CDATA segment, and it
> appears that at least in the Mozilla world the 'encoding' used is 'cut &
> paste', and it's the XBL author's responsibility to make sure that ]]> is
> nowhere in the JavaScript code.
> 
> That seems suboptimal to me, and likely to lead to disasters down the line.
> 
> The only clean solution I can think of is to define a standard
> encoding/decoding process for storing program code (which may very well
> contain occurences of ]]> in CDATA, which effectively hides that triplet
> from the parser.
> 
> While I'm dreaming, it would be nice if all of the relevant language
> communities (JS, Python, Perl, etc.) could agree on what that encoding is.
> I'd love to hear of a recommendation on the topic by the XML folks, but I
> haven't been able to find any such document.
> 
> Any thoughts?
> 
> --david ascher
> 
> 

-- Sjoerd Mullender <sjoerd.mullender@oratrix.com>