[python-uk] Favourite ways of scrubbing HTML/whitelisting specific HTML tags?

Michael Foord fuzzyman at voidspace.org.uk
Thu Feb 7 19:06:09 CET 2008


Michael Sparks wrote:
> On Thursday 07 February 2008 15:48:46 Jon Ribbens wrote:
>   
>> The code at
>> http://www.voidspace.org.uk/python/weblog/arch_d7_2005_04_23.shtml#e35
>> is wrong, for example.
>>     
>
> That's because it whitelists a collection of tags but doesn't whitelist 
> specific attributes, I presume.
>
> I can certainly adapt that code to work the way I'd prefer it.
>
> Changing allowed_tags to something like:
> allowed_tags = {
>     'a': ["id", "name", "href"],
>     'img': ["id", "src"],
>     # ..
>     # <tag>: [<list of allowed attributes>],
> }
>
> would allow that code to be used with only a small modification, if I'm 
> reading your objection right.
>   

Ahhh... sounds entirely plausible.
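
For illustration, here is a minimal sketch of the tag-to-attributes whitelist
described above. It is not the code from the weblog entry: it assumes the
standard library's html.parser module, and the names ALLOWED_TAGS,
WhitelistScrubber and scrub, as well as the extra 'p' and 'em' entries, are
made up for the example. Tags outside the whitelist are dropped (their text is
kept), and attributes outside a tag's list are stripped. It does not check URL
schemes, so something like a javascript: href would still need separate
handling.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tag name -> list of attribute names allowed on it.
ALLOWED_TAGS = {
    "a": ["id", "name", "href"],
    "img": ["id", "src"],
    "p": [],
    "em": [],
}


class WhitelistScrubber(HTMLParser):
    """Re-emit only whitelisted tags, keeping only whitelisted attributes."""

    def __init__(self, allowed):
        super().__init__(convert_charrefs=True)
        self.allowed = allowed
        self.out = []

    def _emit_start(self, tag, attrs, closer):
        if tag not in self.allowed:
            return  # drop the tag itself; any text inside is still kept
        kept = [(k, v) for k, v in attrs if k in self.allowed[tag]]
        rendered = "".join(
            f' {k}="{escape(v or "", quote=True)}"' for k, v in kept
        )
        self.out.append(f"<{tag}{rendered}{closer}>")

    def handle_starttag(self, tag, attrs):
        self._emit_start(tag, attrs, "")

    def handle_startendtag(self, tag, attrs):
        self._emit_start(tag, attrs, " /")

    def handle_endtag(self, tag):
        if tag in self.allowed:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is re-escaped so it cannot reintroduce markup.
        self.out.append(escape(data, quote=False))


def scrub(html_text, allowed=ALLOWED_TAGS):
    parser = WhitelistScrubber(allowed)
    parser.feed(html_text)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)
```

With this, scrub('<a href="http://example.com" onclick="evil()">link</a>')
keeps the href but drops the onclick attribute.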

Michael



