Question: processing HTML, re-write default processing action of many tags
Alex Martelli
aleaxit at yahoo.com
Fri Sep 17 04:56:34 EDT 2004
Hubert Hung-Hsien Chang <hubert at cs.nyu.edu> wrote:
> I know you could use the
>
>
> def start_a
> ....
>
> def end_a
> ....
>
> to process the <a href=...> anchor </a> tags, but is there a
> default method for processing ALL tags? If I just want to change
> some parts of the hyperlink and want to keep other parts of the HTML
> could I just print them out? There should be such a method.
> Can't find it...
You could subclass HTMLParser.HTMLParser and override handle_starttag
and handle_endtag, which are called for every tag rather than for one
tag name. You only talk about processing _tags_, but to reproduce the
rest of the document you will probably also need handle_data for the
text, handle_charref and handle_entityref for references, and possibly
handle_comment too, btw.
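Here is a minimal sketch of that approach, under some assumptions: it
uses the Python 3 module name (html.parser; in the Python 2 of this
thread the class is HTMLParser.HTMLParser), and the names LinkRewriter,
rewrite_href and rewrite_links are made up for illustration. It passes
everything through and changes only the href of <a> tags.

```python
from html import escape
from html.parser import HTMLParser  # Python 3 name for HTMLParser.HTMLParser


class LinkRewriter(HTMLParser):
    """Echo the input as parsed, changing only href values on <a> tags."""

    def __init__(self, rewrite_href):
        # convert_charrefs=False routes &amp; and &#38; through
        # handle_entityref / handle_charref instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.rewrite_href = rewrite_href  # hypothetical callback: old URL -> new URL
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            # Not an anchor: emit the tag text exactly as it appeared.
            self.out.append(self.get_starttag_text())
            return
        parts = [tag]
        for name, value in attrs:
            if name == "href" and value is not None:
                value = self.rewrite_href(value)
            if value is None:
                parts.append(name)  # attribute without a value
            else:
                # The parser unescaped the value, so escape it again.
                parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s>" % " ".join(parts))

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/>: echo as-is (the default would
        # call handle_starttag and then handle_endtag).
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def rewrite_links(html_text, rewrite_href):
    """Return html_text with every <a href=...> passed through rewrite_href."""
    parser = LinkRewriter(rewrite_href)
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Calling rewrite_links(page, some_function) returns the page as a string,
so printing it gives the "keep the other parts" behaviour asked about.
Anchor tags are rebuilt from the parsed attributes, so their quoting and
tag-name case may differ from the input; other start tags are copied
verbatim via get_starttag_text().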
Alex