[Tutor] Re: Regex (almost solved, parser problem remaining)

Erik Price erikprice at mac.com
Sat Aug 30 10:41:02 EDT 2003


On Friday, August 29, 2003, at 01:31  PM, Andrei wrote:

> With my tests it works OK except for one thing: the SGML parser chokes 
> on "&". The original says:
>
>    go to news://bl_a.com/?ha-ha&query=tb for more info
>
> and it's modified to:
>
>    go to <a href="news://bl_a.com/?ha-ha">news://bl_a.com/?ha-ha</a>&query=tb for more info
>
> The bit starting with "&" doesn't make it into the href attribute and 
> hence also falls outside the link.
> The parser recognizes "&" as being the start of an entity ref or a 
> char ref and by default would like to remove it entirely. I use 
> handle_entityref to put it back in, but at that time the preceding 
> text has already been run through the linkify method, which generates 
> the links. This means that the "&query" stuff is added behind the 
> generated link instead of inside it.

I thought it was invalid to have an "&" character that is not part of 
an entity reference in SGML or XML documents.  In other words, the 
document should have "&amp;", not "&", including inside attribute 
values such as href.  Most browsers don't enforce this in HTML, though.
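
For what it's worth, here is a minimal sketch of escaping the "&" while generating the link, so the whole URL ends up inside the href. It assumes the input is plain text rather than already-parsed HTML, and it uses html.escape from current Python 3. The URL_RE pattern and this linkify function are hypothetical stand-ins for illustration, not Andrei's actual code.

```python
import re
from html import escape

# Hypothetical URL pattern: a known scheme, "://", then a run of
# characters that are not whitespace, angle brackets, or double quotes.
URL_RE = re.compile(r"\b(?:https?|ftp|news)://[^\s<>\"]+")

def linkify(text):
    """Wrap URLs found in plain text in <a> tags, escaping all output for HTML."""
    parts = []
    last = 0
    for m in URL_RE.finditer(text):
        # Escape the plain text that precedes this URL.
        parts.append(escape(text[last:m.start()]))
        # Escape the URL itself so "&" becomes "&amp;" inside the href.
        url = escape(m.group(0), quote=True)
        parts.append('<a href="%s">%s</a>' % (url, url))
        last = m.end()
    # Escape whatever text follows the last URL.
    parts.append(escape(text[last:]))
    return "".join(parts)

print(linkify("go to news://bl_a.com/?ha-ha&query=tb for more info"))
```

With the "&" written as "&amp;", an SGML or XML parser reading the result should no longer treat "&query" as the start of an entity reference.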


Erik



