A critique of cgi.escape

Georg Brandl g.brandl-nospam at gmx.net
Sun Sep 24 06:41:14 EDT 2006


Lawrence D'Oliveiro wrote:
> In message <mailman.518.1159087749.10491.python-list at python.org>, Fredrik
> Lundh wrote:
> 
>> Jon Ribbens wrote:
>> 
>>> Making cgi.escape always escape the '"' character would not break
>>> anything, and would probably fix a few bugs in existing code. Yes,
>>> those bugs are not cgi.escape's fault, but that's no reason not to
>>> be helpful. It's a minor improvement with no downside.
>> 
>> the "improvement with no downside" would bloat down the output for
>> everyone who's using the function in the intended way, and will also
>> break unit tests.
> 
> I don't understand this "bloat down" nonsense. Any tests that would break
> are obviously testing the wrong thing.

&quot; is 5 characters more than ".

>>  > One thing that is flat-out wrong, by the way, is that cgi.escape()
>>  > does not encode the apostrophe (') character.
>> 
>> it's intentional, of course: you're supposed to use " if you're using
>> cgi.escape(s, True) to escape attributes.
> 
> Attributes can be quoted with either single or double quotes. That's what
> the HTML spec says. cgi.escape doesn't correctly allow for that. Ergo,
> cgi.escape is broken. QED.

A function is broken if its implementation doesn't match its documentation.

As a courtesy, I've pasted the documentation below.

escape(s[, quote])
     Convert the characters "&", "<" and ">" in string s to HTML-safe
     sequences. Use this if you need to display text that might contain such
     characters in HTML. If the optional flag quote is true, the quotation
     mark character (""") is also translated; this helps for inclusion in an
     HTML attribute value, as in <A HREF="...">. If the value to be quoted
     might include single- or double-quote characters, or both, consider
     using the quoteattr() function in the xml.sax.saxutils module instead.
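
To see what that documentation describes in practice, here is a minimal
sketch. Note that escape_sketch() is a hypothetical stand-in written from the
description above, not the actual cgi.escape source; quoteattr() is the real
function from xml.sax.saxutils.

```python
from xml.sax.saxutils import quoteattr


def escape_sketch(s, quote=False):
    """Hypothetical stand-in that follows the documented behaviour only."""
    # "&" has to be replaced first, otherwise the "&" introduced by the
    # later replacements would be escaped a second time.
    s = s.replace("&", "&amp;")
    s = s.replace("<", "&lt;")
    s = s.replace(">", "&gt;")
    if quote:
        # Only the double quote is translated; the apostrophe is left alone.
        s = s.replace('"', "&quot;")
    return s


value = 'Tom & "Jerry" <it\'s>'

# Body text: quotes are untouched.
print(escape_sketch(value))

# Value meant for a double-quoted attribute: '"' becomes '&quot;'.
print('<a title="%s">' % escape_sketch(value, True))

# quoteattr() chooses the delimiter itself and returns it with the value.
print("<a title=%s>" % quoteattr(value))
```

With the quote flag set, the result is only safe inside a double-quoted
attribute, which is the use the documentation describes. quoteattr() covers
the case where the delimiter is not known in advance, since it returns the
value already wrapped in whichever quote character fits.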


Now, do you still think cgi.escape is broken?


Georg


