A critique of cgi.escape

Duncan Booth duncan.booth at invalid.invalid
Mon Sep 25 10:25:51 EDT 2006


Jon Ribbens <jon+usenet at unequivocal.co.uk> wrote:

> In article <Xns984996E6BABCEduncanbooth at 127.0.0.1>, Duncan Booth
> wrote: 
>> It is generally a principle of Python that new releases maintain
>> backward compatibility. An incompatible change such as the one proposed here
>> would probably break many tests for a large number of people.
> 
> Why is the suggested change incompatible? What code would it break?
> I agree that it would be a bad idea if it did indeed break backwards
> compatibility - but it doesn't.

I guess you've never seen anyone write tests which retrieve some generated 
HTML and compare it against the expected value. If the page contains any 
unescaped quotes then this change would break those tests.
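
A minimal sketch of the kind of test meant here. It assumes a modern Python, where html.escape stands in for cgi.escape: quote=False behaves like the existing default (only &, < and > are escaped), and quote=True stands in for the proposed change. The render_* functions and the sample text are hypothetical, made up purely for illustration.

```python
import html

def render_current(text):
    # Hypothetical page generator using today's behaviour:
    # only &, < and > are escaped, quotes pass through untouched.
    return "<p>" + html.escape(text, quote=False) + "</p>"

def render_proposed(text):
    # Same generator if the escaping function also escaped quotes.
    return "<p>" + html.escape(text, quote=True) + "</p>"

# The value a test author would have recorded against the current behaviour.
expected = '<p>He said "hi" &amp; left</p>'

current_output = render_current('He said "hi" & left')
changed_output = render_proposed('He said "hi" & left')

print(current_output == expected)   # the existing test passes
print(changed_output == expected)   # the same test fails after the change
```

Both outputs are equally valid HTML for element content, but a test that compares strings literally cannot know that.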

> 
>> There should be a one-stop shop where I can take my unicode text and 
>> convert it into something I can safely insert into a generated html
>> page; 
> 
> I disagree. I think that doing it in one is muddled thinking and
> liable to lead to bugs. Why not keep your output as unicode until it
> is ready to be output to the browser, and encode it as appropriate
> then? Character encoding and character escaping are separate jobs with
> separate requirements that are better off handled by separate code.

Sorry, "convert it into something I can safely insert" wasn't meant to imply 
encoding, just entity escaping.

To be clear:

I'm talking about encoding certain characters as entity references. It 
doesn't matter whether it's the character ampersand or right double quote: 
they both want to be converted to entities. Same operation.

The resulting string might be a byte string or it might still be unicode: 
the point being that the conversion I want is from unescaped to entity 
escaped, not from unicode to byte encoded. Right now the only way the 
Python library gives me to do the entity escaping properly has a side 
effect of encoding the string. I should be able to do the escaping without 
having to encode the string at the same time.
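
To make the distinction concrete, here is a rough sketch under a few assumptions: it is written for a modern Python (html.escape rather than cgi.escape), and escape_entities is a hypothetical helper, not something the standard library provides. It escapes markup characters and replaces non-ASCII characters with numeric character references, yet returns text rather than bytes. The library route shown after it gets the same characters out, but only as a side effect of encoding.

```python
import html

def escape_entities(text):
    # Hypothetical helper: entity-escape without encoding.
    # Step 1: escape the markup-significant characters (&, <, >, quotes).
    escaped = html.escape(text, quote=True)
    # Step 2: replace every non-ASCII character with a numeric reference.
    return "".join(
        ch if ord(ch) < 128 else "&#%d;" % ord(ch)
        for ch in escaped
    )

# Ampersand plus left and right double quotation marks (U+201C, U+201D).
sample = "Fish & \u201cchips\u201d"

# Escaped, and still a text string.
result = escape_entities(sample)

# The library route: the xmlcharrefreplace error handler produces the
# references, but only while encoding, so the result is a byte string.
encoded = html.escape(sample, quote=True).encode("ascii", "xmlcharrefreplace")

print(type(result).__name__, result)
print(type(encoded).__name__, encoded)
```

The two results contain the same characters; the difference is only that the second one has already been committed to a byte encoding.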


