[XML-SIG] Normalized AttVals

John Day jday@csihq.com
Mon, 14 Dec 1998 16:20:14 -0500


Forgive my ignorance of Python and the XML standards, but I 
am confused by the behavior of pyexpat.

Re: quoted attribute contents ("AttVal")
When '>' is encountered, e.g. <code op=">">, it is "normalized"
to '&gt;'. However, when '&' is encountered, e.g.
<a href="www.zzz.com?a=1&b=3">, it is a fatal error.
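
A minimal sketch of the two cases, assuming the present-day
xml.parsers.expat module (the standard-library interface to pyexpat);
a 1998 build may report things differently. The helper name start_tags
is made up for illustration. In this sketch the handler receives the
literal '>' character, so a '&gt;' presumably appears only when the
value is escaped again on output.

```python
import xml.parsers.expat

def start_tags(document):
    """Parse `document` and return (name, attributes) for each start tag."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: seen.append((name, attrs))
    parser.Parse(document, True)  # True: this is the final chunk of input
    return seen

# A literal '>' inside a quoted attribute value is accepted.
good = start_tags('<code op=">"/>')

# A bare '&' starts a reference, so '&b' without ';' is not well-formed.
bad_error = None
try:
    start_tags('<a href="www.zzz.com?a=1&b=3"/>')
except xml.parsers.expat.ExpatError as err:
    bad_error = err

# Writing the ampersand as '&amp;' parses, and the handler gets '&' back.
escaped = start_tags('<a href="www.zzz.com?a=1&amp;b=3"/>')

print(good)
print(bad_error)
print(escaped)
```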

Is this pyexpat behavior correct? Why can't the parser tell that
'&b' above is _not_ a defined entity because it is not terminated
by ';'? It seems to me that this usage could be normalized to
'&amp;b', just like pyexpat did for '>'. Then it would be backward
compatible with HTML (sort of).
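
The normalization suggested above could be approximated outside the
parser, as a pre-pass over the text before it is handed to pyexpat.
This is only a sketch of a workaround, not parser behavior: the
pattern BARE_AMP and the function escape_bare_ampersands are
hypothetical names, and anything that already looks like a terminated
reference (such as '&copy;' inside a query string) is left untouched.

```python
import re

# Match '&' unless it begins something shaped like a terminated reference:
# a named entity (&name;), a decimal (&#38;) or a hex (&#x26;) character ref.
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z_:][\w.:-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")

def escape_bare_ampersands(text):
    """Rewrite every unterminated '&' as '&amp;' and leave the rest alone."""
    return BARE_AMP.sub("&amp;", text)

fixed = escape_bare_ampersands('<a href="www.zzz.com?a=1&b=3">')
untouched = escape_bare_ampersands("&amp; &#38; &#x26; &lt;")

print(fixed)
print(untouched)
```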

The impact of this seems to be enormous. All of the existing HTML
parameter generators will have to change the way they post arguments,
when HTML is replaced by XML, right?
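
On the generating side, the change may be smaller than it looks: only
the markup needs the escaped form, and a parser hands the plain '&'
back to the application. A sketch using the standard library's
urlencode and quoteattr, with the example URL from above (the variable
names are illustrative):

```python
from urllib.parse import urlencode
from xml.sax.saxutils import quoteattr

# Build the query string exactly as before; it still uses a plain '&'.
query = urlencode({"a": 1, "b": 3})
href = "www.zzz.com?" + query

# quoteattr escapes '&', '<' and '>' and adds the surrounding quotes.
tag = "<a href=%s/>" % quoteattr(href)

print(query)
print(tag)
```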

-jday