htmlparser charrefs

Robin Becker robin at reportlab.com
Thu Dec 22 08:43:46 EST 2016


For various reasons I am using HTMLParser to transcribe some XML. I 
need to keep charrefs as they are, so for Python >= 3.4 I pass in

convert_charrefs=False

to the constructor.
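
For illustration, a minimal sketch of that setup follows. The names DataEcho and echo_data are hypothetical and not from the original post; the sketch assumes that re-emitting each reference from the handle_entityref and handle_charref callbacks is enough to keep charrefs in data unchanged.

```python
from html.parser import HTMLParser


class DataEcho(HTMLParser):
    """Hypothetical helper: collects text and keeps character references undecoded."""

    def __init__(self):
        # With convert_charrefs=False the parser reports references in data
        # through handle_entityref / handle_charref instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        # Named reference such as &amp; (name is "amp").
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        # Numeric reference such as &#65; or &#x41; (name is "65" or "x41").
        self.out.append("&#%s;" % name)


def echo_data(markup):
    parser = DataEcho()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

Called as echo_data("<p>a &amp; b &#65;</p>"), this should return the text content with both references left as written.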

This seems to work as intended for data, but I notice that a change in Python 
3.4 prevents me from keeping the charrefs that appear in attribute values.
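
The attribute behaviour can be reproduced with a short sketch like the one below. AttrRecorder and parsed_attrs are hypothetical names used only for this illustration.

```python
from html.parser import HTMLParser


class AttrRecorder(HTMLParser):
    """Hypothetical helper: records the attrs passed to handle_starttag."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.attrs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; by this point the parser
        # has already unescaped the values, whatever convert_charrefs says.
        self.attrs.extend(attrs)


def parsed_attrs(markup):
    parser = AttrRecorder()
    parser.feed(markup)
    parser.close()
    return parser.attrs
```

For the input '<a href="x?a=1&amp;b=2">' the recorded value is "x?a=1&b=2", so the original &amp; is no longer visible to the handler.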

Is it intentional that we can no longer use HTMLParser.unescape? It seems to 
prevent correct interpretation of the convert_charrefs constructor argument.

The unescaping is still done, but in a module-level function, which means I can 
no longer override that functionality safely.
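
One possible workaround, offered only as a sketch and not as something confirmed in this thread, is to ignore the parsed attrs and copy the raw tag source returned by HTMLParser.get_starttag_text(). RawTagEcho and transcribe are hypothetical names. The sketch assumes that echoing the start tag verbatim is acceptable, and it does not restore the case of end tags, which the parser lowercases.

```python
from html.parser import HTMLParser


class RawTagEcho(HTMLParser):
    """Hypothetical helper: re-emits markup with charrefs left untouched."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # attrs has already been unescaped, so copy the raw source text
        # of the start tag instead of rebuilding it from attrs.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags: the raw text already ends in "/>".
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        # Note: tag arrives lowercased, which may matter for XML input.
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)


def transcribe(markup):
    parser = RawTagEcho()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

This avoids replacing the module-level function, at the cost of having to re-parse the raw tag text if individual attributes need to be changed.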
-- 
Robin Becker



