A critique of cgi.escape

Lawrence D'Oliveiro ldo at geek-central.gen.new_zealand
Sat Sep 23 08:00:16 EDT 2006


The "escape" function in the "cgi" module escapes characters with special
meanings in HTML. The ones that need escaping are '<', '&' and '"'.
However, cgi.escape only escapes the quote character if you pass a second
argument of True (the default is False):

    >>> cgi.escape("the \"quick\" & <brown> fox")
    'the "quick" &amp; &lt;brown&gt; fox'
    >>> cgi.escape("the \"quick\" & <brown> fox", True)
    'the &quot;quick&quot; &amp; &lt;brown&gt; fox'

This seems to me to be dumb. The default option should be the safe one: that
is, escape _all_ the potentially troublesome characters. The only time you
can get away with NOT escaping the quote character is in text outside of
tags, where it cannot terminate an attribute value, e.g.

    <TEXTAREA>
    unescaped "quotes" allowed here
    </TEXTAREA>

Nevertheless, even in that situation, escaped quotes are acceptable.
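
To make the risk concrete, here is a minimal sketch of what goes wrong when the quote is left alone inside an attribute value. The escape function below is a stand-in written to mimic the behaviour shown above, not the cgi module's actual source, and the INPUT tag and the value string are made-up examples.

```python
def escape(s, quote=False):
    """Stand-in for cgi.escape, mimicking the behaviour described above."""
    # '&' must be replaced first, otherwise the entities produced
    # by the later replacements would themselves be re-escaped.
    s = s.replace("&", "&amp;")
    s = s.replace("<", "&lt;")
    s = s.replace(">", "&gt;")
    if quote:
        s = s.replace('"', "&quot;")
    return s


# Hypothetical user-supplied value containing a double quote.
value = 'x" onmouseover="alert(1)'

# Default call: the quote survives and closes the VALUE attribute early.
unsafe = '<INPUT NAME="q" VALUE="%s">' % escape(value)

# Second argument True: the quote is escaped and stays inside the attribute.
safe = '<INPUT NAME="q" VALUE="%s">' % escape(value, True)

print(unsafe)
print(safe)
```

With the default call, the quote in the value ends the VALUE attribute and the rest of the string is read as a new attribute. With True as the second argument, the whole string stays inside the attribute.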

So I think the default for the second argument to cgi.escape should be
changed to True. Or alternatively, the second argument should be removed
altogether, and quotes should always be escaped.

Can changing the default break existing scripts? I don't see how. It might
even fix a few lurking bugs out there.


