Python's handling of Unicode surrogates

Rhamphoryncus rhamph at gmail.com
Fri Apr 20 22:12:29 EDT 2007


On Apr 20, 5:49 pm, Ross Ridge <rri... at caffeine.csclub.uwaterloo.ca>
wrote:
> Rhamphoryncus  <rha... at gmail.com> wrote:
> >The only code that will be changed is that which doesn't handle
> >surrogates properly.  Some will start working properly.  Some (e.g.
> >random.choice(u'\U00100000\uFFFF')) will fail explicitly (rather than
> >silently).
>
> You're falsely assuming that any code that doesn't support surrogates
> is broken.  Supporting surrogates is no more required than supporting
> combining characters, right-to-left languages or lower case letters.

No, I'm only assuming that code which would raise an error under my
change is broken.  The change would have minimal effect because it
builds on the existing correct way of doing things, extending it to
handle some additional cases.
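
To make the random.choice case concrete, here is a rough sketch that
simulates narrow-build storage on a current Python, where str is
indexed by code point.  The helper names (utf16_units, code_points) are
made up for illustration and are not part of the proposed change; the
sketch only shows why picking a single UTF-16 code unit can hand back
half of a surrogate pair, and what failing explicitly could look like.

```python
import random
import struct


def utf16_units(text):
    """Return the UTF-16 code units of text (hypothetical helper)."""
    data = text.encode("utf-16-le", "surrogatepass")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def code_points(units):
    """Combine code units into code points (hypothetical helper).

    Raises ValueError on a lone surrogate instead of passing it through.
    """
    result = []
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # High surrogate: it must be followed by a low surrogate.
            if i + 1 >= len(units) or not 0xDC00 <= units[i + 1] <= 0xDFFF:
                raise ValueError("lone high surrogate at index %d" % i)
            low = units[i + 1]
            result.append(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        elif 0xDC00 <= unit <= 0xDFFF:
            raise ValueError("lone low surrogate at index %d" % i)
        else:
            result.append(unit)
            i += 1
    return result


units = utf16_units("\U00100000\uFFFF")

# Choosing among code units can return one half of the surrogate pair.
unit_choice = random.choice(units)

# Choosing among whole code points cannot.
point_choice = random.choice(code_points(units))
```

Here units holds three code units for two characters, so a choice made
over units returns a lone surrogate two times out of three, while
code_points(units[:1]) raises ValueError rather than returning one.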

--
Adam Olsen, aka Rhamphoryncus



