REALLY simple xml reader

Diez B. Roggisch deets at nospam.web.de
Thu Jan 31 11:35:39 EST 2008


Steven D'Aprano wrote:
> On Fri, 01 Feb 2008 00:40:01 +1100, Ben Finney wrote:
> 
>> Quite apart from a human thinking it's pretty or not pretty, it's *not
>> valid XML* if the XML declaration isn't immediately at the start of the
>> document <URL:http://www.w3.org/TR/xml/#sec-prolog-dtd>. Many XML
>> parsers will (correctly) reject such a document.
> 
> You know, I'd really like to know what the designers were thinking when 
> they made this decision.
> 
> "You know Bob, XML just isn't hostile enough to anyone silly enough to 
> believe it's 'human-readable'. What can we do to make it more hostile?"
> 
> "Well Fred, how about making the XML declaration completely optional, so 
> you can leave it out and still be valid XML, but if you include it, you're 
> not allowed to precede it with semantically neutral whitespace?"
> 
> "I take my hat off to you."
> 
> 
> This is legal XML:
> 
> """<?xml version="1.0"?>
> <greeting>Hello, world!</greeting>"""
> 
> and so is this:
> 
> """
>      <greeting       >Hello, world!</greeting    >"""
> 
> 
> but not this:
> 
> """ <?xml version="1.0"?>
> <greeting>Hello, world!</greeting>"""
> 
> 
> You can't get this sort of stuff except out of a committee.
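
For anyone who wants to check the three quoted cases, here is a minimal sketch. It assumes a current Python 3 and the standard-library xml.etree.ElementTree module (which uses expat underneath). The helper name is_well_formed and the variable names are made up for illustration, and other parsers may report the error differently.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text, False on a parse error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# The three documents quoted above, as ordinary Python strings.
with_declaration = '<?xml version="1.0"?>\n<greeting>Hello, world!</greeting>'
no_declaration = '\n     <greeting       >Hello, world!</greeting    >'
space_before_declaration = ' <?xml version="1.0"?>\n<greeting>Hello, world!</greeting>'

for name, doc in [
    ("with_declaration", with_declaration),
    ("no_declaration", no_declaration),
    ("space_before_declaration", space_before_declaration),
]:
    print(name, is_well_formed(doc))
```

Only the last document, where a single space precedes the declaration, should be rejected.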


Do not forget to mention that trailing whitespace is evil as well!

And that, for some reason, some characters below \x20 aren't allowed in 
XML even though they are legal UTF-8. What the reason is, I'd really like to know...
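
A small sketch of the control-character point, again assuming a current Python 3 and the standard-library xml.etree.ElementTree. XML 1.0 allows only tab (\x09), line feed (\x0a) and carriage return (\x0d) below \x20; the helper name and the sample strings are invented for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text, False on a parse error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# \x01 encodes to UTF-8 without complaint, but XML 1.0 forbids it.
with_control_char = "<greeting>Hello,\x01world!</greeting>"
# Tab is one of the three characters below \x20 that XML 1.0 permits.
with_tab = "<greeting>Hello,\tworld!</greeting>"

# Encoding succeeds for both strings, so the UTF-8 layer is not the problem.
with_control_char.encode("utf-8")
with_tab.encode("utf-8")

print(is_well_formed(with_control_char))
print(is_well_formed(with_tab))
```

The first string should be rejected by the parser and the second accepted, even though both encode cleanly.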

Diez


