[I18n-sig] Unicode surrogates: just say no!

Tim Peters tim.one@home.com
Thu, 28 Jun 2001 00:55:08 -0400


[Guido]
> Unfortunately, the surrogate-creating behavior of \U is present in
> 2.0 and 2.1, so I think we can't reasonably remove this from narrow
> Python 2.2

[Paul Prescod]
> I'm having a hard time caring about backwards compatibility much here.
> And I can't square it with your enthusiasm for ripping the guts out of
> poor old xrange. <wink>

But there's a HUGE difference.  The xrange() behaviors we're seeking to shed
have been documented for years.  But the Python 2.1 Reference Manual's
section on Unicode literals reads:

    2.4.3 Unicode literals
    XXX explain more here...

in its entirety, and the word "surrogate" appears nowhere at all.  Well, OK,
it's mentioned twice in unicode.txt, both times in a disclaimer sense ("we
don't need no stinkin' surrogates -- and neither do you").

See?  I thought you would, if someone just paused to explain it <wink>.

> ...
> Let's just put a deprecation warning in for \U where you've asked for a
> character larger than your build's code unit size.

More consideration than it merits, if anyone were silly enough to ask me.
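
For readers who want the "surrogate-creating behavior of \U" spelled out:
on a narrow build, a \U escape naming a code point above U+FFFF gets stored
as two 16-bit code units. The sketch below shows only the standard UTF-16
arithmetic behind that split. The function name to_surrogate_pair is
hypothetical, chosen for illustration, and this is not the interpreter's
actual code.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into a UTF-16 (high, low) pair.

    Hypothetical helper for illustration only; it mirrors the UTF-16
    encoding rule, not any particular Python build's internals.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("expected a code point in U+10000..U+10FFFF")
    offset = code_point - 0x10000          # 20 significant bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits -> low surrogate
    return high, low


high, low = to_surrogate_pair(0x10000)
print(hex(high), hex(low))  # 0xd800 0xdc00
```

So a single \U escape for such a character yields a string of length two
on a narrow build, which is the behavior the proposed deprecation warning
would flag.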