xml parsing escape characters

Kent Johnson kent3737 at yahoo.com
Thu Jan 20 12:33:49 EST 2005


Irmen de Jong wrote:
> Kent Johnson wrote:
> [...]
> 
>> This is an XML document containing a single tag, <string>, whose 
>> content is text containing entity-escaped XML.
>>
>> This is *not* an XML document containing tags <DataSet>, <Order>, 
>> <Customer>, etc.
>>
>> All the behaviour you are seeing is a consequence of this. You need to 
>> unescape the contents of the <string> tag to be able to treat it as 
>> structured XML.
> 
> 
> The unescaping is usually done for you by the xml parser that you use.

Yes, so if your XML contains for example
<stuff>&lt;not a tag&gt;</stuff>

and you parse this and ask for the *text* content of the <stuff> tag, you will get the string
"<not a tag>"

but it's still *not* a tag. If you try to get the child elements of the <stuff> element, there will be none.
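
Here is a minimal sketch of that point. It assumes the standard-library xml.etree.ElementTree module; any other parser should behave the same way, and the variable names are only for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative one-element document: the content of <stuff> is
# entity-escaped text, not markup.
doc = "<stuff>&lt;not a tag&gt;</stuff>"
stuff = ET.fromstring(doc)

text = stuff.text        # the parser has already unescaped the entities
children = list(stuff)   # child elements of <stuff>

print(repr(text))        # the unescaped text content
print(children)          # no child elements
```

The text comes back unescaped, but the list of children is empty because the parser never saw a tag inside <stuff>.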

This is exactly the confusion the OP has.
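
One way to get at the structure, sketched here with xml.etree.ElementTree, is to parse the text content of <string> a second time. The payload below is made up for illustration (only the tag names come from this thread), and it assumes the escaped text is itself a well-formed XML document.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload: the inner XML is entity-escaped inside <string>.
outer_doc = (
    "<string>"
    "&lt;DataSet&gt;&lt;Order&gt;&lt;Customer&gt;442&lt;/Customer&gt;"
    "&lt;/Order&gt;&lt;/DataSet&gt;"
    "</string>"
)

outer = ET.fromstring(outer_doc)
outer_children = list(outer)      # <string> has no child elements

# First parse: outer.text is the unescaped inner document as a plain string.
# Second parse: only now do <DataSet>, <Order>, <Customer> become elements.
inner = ET.fromstring(outer.text)
customer = inner.find("Order/Customer").text

print(inner.tag, customer)
```

The first parse yields only a string; the element tree for <DataSet> exists only after the second call to fromstring().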

> 
> --Irmen


