[Python-3000] Raw strings containing \u or \U

Guido van Rossum guido at python.org
Wed May 16 23:10:17 CEST 2007


On 5/16/07, Ron Adam <rrr at ronadam.com> wrote:
> Guido van Rossum wrote:
> > That would be great! This will automatically turn \u1234 into 6
> > characters, right?
>
> I'm not exactly clear when the '\uxxxx' characters get converted.  There
> isn't any conversion done in tokenize.c that I can see.  It's primarily
> only concerned with finding the beginning and ending of the string at that
> point.  It looks like everything between the beginning and end is just
> passed along "as is" and it's translated further later in the chain.

OK, I think that happens in a totally different place. But it also
needs to be fixed. :-)
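
For reference, a minimal sketch of the behaviour being asked for, assuming
Python 3 semantics as eventually released (not the 2007 branch under
discussion). The variable names are illustrative only.

```python
# Assumption: Python 3 as released; raw literals do not interpret \u escapes.
raw = r"\u1234"      # raw literal: backslash, 'u', and four digits are kept
cooked = "\u1234"    # ordinary literal: the escape becomes one code point

print(len(raw))      # 6
print(len(cooked))   # 1
print(list(raw))     # the six individual characters of the raw literal
```

The raw literal keeps all six source characters, while the ordinary literal
collapses to the single code point U+1234.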

> (I said tokenize.py earlier; I meant tokenize.c)

Well, actually, tokenize.py also needs adjustments to support this...
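
To illustrate the "passed along as is" point with the pure-Python tokenizer,
here is a sketch assuming the tokenize module as it ships in current Python 3
(not the 2007 code). The STRING token carries the literal's source text
unchanged; turning it into a string value happens in a later step, which
ast.literal_eval stands in for here.

```python
import ast
import io
import tokenize

# One line of source code containing a raw string literal: x = r"\u1234"
source = 'x = r"\\u1234"\n'

# Collect the text of every STRING token the tokenizer reports.
string_tokens = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.STRING
]

# The token is the literal exactly as written, prefix and quotes included.
print(string_tokens[0])   # r"\u1234"

# Evaluating the literal is a separate, later step (assumed stand-in).
value = ast.literal_eval(string_tokens[0])
print(len(value))         # 6
```

The tokenizer only finds where the literal starts and ends; whether \u1234
stays as six characters is decided when the literal is evaluated.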

> > Perhaps you could make the patch against the py3k-struni branch
> > instead of against the regular p3yk (sic) branch?
>
> I can do that.  :-)

Great!

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

