[Python-porting] Stuck with raw unicode literals

Lennart Regebro regebro at gmail.com
Thu May 23 08:00:54 CEST 2013


On Thu, May 23, 2013 at 7:37 AM, Chitrank Dixit <chitrankdixit at gmail.com> wrote:
> Hello Python developers
>
> I am working on porting Python 2.7 code to 3.3 and later. Unicode literals
> will be fine, but raw unicode literals (ur' ') are still an issue. I have
> found a solution in the six module.
>
> This is the solution
>
> original python 2.7 code
> ur'Pythondev'
>
> using six module for py 2.7>=3.3
> six.u(r'Pythondev')
>
> Is my solution okay, or is something else needed? I am very
> confused about this.

Well, the first question is why you need a raw unicode literal in the
first place.

Raw string literals simply take backslash escape sequences literally,
so that r'\x00' is a four-character string, while as a normal string
'\x00' is a one-character string.
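
A minimal sketch to check this in an interpreter, using the hex escape
\x00 as the example (the variable names are only for illustration):

```python
# A raw literal keeps the backslash and the following characters as-is.
raw_form = r'\x00'
# A normal literal turns the \x00 escape into a single NUL character.
normal_form = '\x00'

print(len(raw_form))     # backslash, 'x', '0', '0'
print(len(normal_form))  # one NUL character
print(raw_form == '\\x00')
```

The last line shows that the raw literal is the same string you get by
doubling the backslash in a normal literal.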

Raw Unicode literals are weird beasts in Python 2: most backslash
escape sequences are taken literally, but the Unicode escape
sequences are not, so ur'\x00' is four characters, but ur'\u0000' is one.
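
Python 3 has no ur'' prefix, so this can't be shown there directly. As a
rough sketch, the 'raw_unicode_escape' codec processes only the \uXXXX
and \UXXXXXXXX escapes, which approximates how Python 2 read a raw
unicode literal. The helper name py2_raw_unicode is made up for this
illustration, and it assumes the text contains only Latin-1 characters:

```python
def py2_raw_unicode(text):
    """Approximate how Python 2 read the body of a ur'...' literal.

    Hypothetical helper: only \\uXXXX and \\UXXXXXXXX escapes are
    processed, every other backslash sequence is left untouched.
    """
    return text.encode('latin-1').decode('raw_unicode_escape')

hex_escape = py2_raw_unicode(r'\x00')       # stays backslash, x, 0, 0
unicode_escape = py2_raw_unicode(r'\u0000') # becomes one NUL character

print(len(hex_escape))
print(len(unicode_escape))
```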

The simplest way to handle it might simply be to escape the
backslashes, i.e. change ur'\0x57bla\foo' to u'\\0x57bla\\foo',
which is accepted by both Python 2.7 and Python 3.3+.
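
A quick check of that rewrite, as a sketch you can run under Python 3.3+
(where the u'' prefix is allowed again); the variable names are only
for illustration:

```python
# Escaped-backslash form, valid in both Python 2.7 and Python 3.3+.
escaped = u'\\0x57bla\\foo'
# Plain raw literal, used here only to compare the resulting text.
raw = r'\0x57bla\foo'

print(escaped == raw)  # both spell the same characters
print(len(escaped))    # each doubled backslash counts as one character
```

This only works as a drop-in replacement when the original literal has
no \uXXXX escapes that were meant to be interpreted; those should be
left with a single backslash.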

//Lennart

