A Freudian slip of *EPIC PROPORTIONS*!

Chris Angelico rosuav at gmail.com
Thu Nov 13 18:35:31 EST 2014


On Fri, Nov 14, 2014 at 10:11 AM, Rick Johnson
<rantingrickjohnson at gmail.com> wrote:
>     # The parse functions have no idea what to do with
>     # Unicode, so replace all Unicode characters with "x".
>     # This is "safe" so long as the only characters germane
>     # to parsing the structure of Python are 7-bit ASCII.
>     # It's *necessary* because Unicode strings don't have a
>     # .translate() method that supports deletechars.

Sounds to me just like the functions that collapse whitespace to single
spaces, or turn all letters into "A" and all digits into "9", or
lowercase/casefold all alphabetics, or strip diacriticals, or anything
else of that nature. It's often simpler to fold equivalent characters
together before parsing or comparing strings. Doing that doesn't mean
you don't respect Unicode; in fact, it proves that you *do*.
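
As a rough sketch of that kind of folding (Python 3; the function names
here are made up for illustration and are not taken from the code being
quoted or from any library), each of these throws away a distinction the
parser or comparison doesn't care about:

```python
import re
import unicodedata


def collapse_whitespace(text):
    """Collapse every run of whitespace to a single space."""
    return re.sub(r"\s+", " ", text).strip()


def strip_diacritics(text):
    """Decompose characters (NFD), then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def mask_non_ascii(text, filler="x"):
    """Replace each non-ASCII character with a filler character."""
    return "".join(ch if ord(ch) < 128 else filler for ch in text)


def shape(text):
    """Turn every letter into "A" and every digit into "9"."""
    return "".join(
        "A" if ch.isalpha() else "9" if ch.isdigit() else ch
        for ch in text
    )


# Case-insensitive comparison via the built-in str.casefold().
same = "Stra\u00dfe".casefold() == "STRASSE".casefold()
```

Note that mask_non_ascii() swaps one character for one character, so
the result has the same length as the input and any offsets computed on
the masked text still line up with the original.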

So if you stop calling Unicode "vile", you might actually learn something.

ChrisA


