[Python-Dev] Security implications of pep 383
Lennart Regebro
regebro at gmail.com
Tue Mar 29 22:45:43 CEST 2011
On Tue, Mar 29, 2011 at 22:40, Lennart Regebro <regebro at gmail.com> wrote:
> The lesson here seems to be "if you have to use blacklists, and you
> use unicode strings for those blacklists, also make sure the string
> you compare with doesn't have surrogates".
>
For that matter, what happens with combining characters?
'\N{LATIN SMALL LETTER O}\N{COMBINING DIAERESIS}' !=
'\N{LATIN SMALL LETTER O WITH DIAERESIS}'
I guess the filesystem shouldn't treat these as the same (even though
they are canonically equivalent), but what if some web service does? I
suspect you should normalize both strings before comparing them against
any blacklist. And what happens to surrogates when you normalize?
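
To make the question concrete, here is a minimal sketch in Python 3,
assuming NFC is the normalization form you settle on. The helpers
same_after_nfc and has_surrogates are hypothetical names of mine, not
anything from the stdlib, and the byte string is just a made-up
undecodable filename. As far as I can tell, CPython's
unicodedata.normalize passes lone surrogates through unchanged, so the
surrogate check still has to be done separately.

```python
import unicodedata

decomposed = "\N{LATIN SMALL LETTER O}\N{COMBINING DIAERESIS}"
precomposed = "\N{LATIN SMALL LETTER O WITH DIAERESIS}"


def same_after_nfc(a, b):
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def has_surrogates(s):
    """Hypothetical helper: True if s contains any surrogate code point."""
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in s)


# The plain comparison sees two different code point sequences.
raw_equal = decomposed == precomposed

# After normalizing both sides, the two spellings compare equal.
nfc_equal = same_after_nfc(decomposed, precomposed)

# A made-up name as PEP 383 would hand it to us: valid UTF-8 for
# "o" + COMBINING DIAERESIS, followed by an undecodable byte 0xFF,
# which surrogateescape maps to the lone surrogate U+DCFF.
escaped = b"o\xcc\x88\xff".decode("utf-8", "surrogateescape")
normalized = unicodedata.normalize("NFC", escaped)

# Normalization composes the "o" but does not remove the surrogate.
still_has_surrogate = has_surrogates(normalized)
```

So a blacklist check would presumably want both steps: reject or
special-case anything where has_surrogates is true, and normalize
whatever remains before comparing.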
//Lennart