[Python-Dev] should we keep the \xnnnn escape in unicode strings?

Finn Bock bckfnn@worldonline.dk
Sun, 16 Jul 2000 12:42:01 GMT


[Fredrik Lundh]

>mal wrote:
>
>> >     1. treat \x as a hexadecimal byte, not a hexadecimal
>> >     character.  or in other words, make sure that
>> >
>> >         ord("\xabcd") == ord(u"\xabcd")
>> >
>> >     fwiw, this is how it's done in SRE's parser (see the
>> >     python-dev archives for more background).
>...
>> >     5. leave it as it is (just fix the comment).
>>
>> I'd suggest 5 -- makes converting 8-bit strings using \x
>> to Unicode a tad easier.
>
>if that's the main argument, you really want alternative 1.
>
>with alternative 5, the contents of the string may change
>if you add a leading "u".
>
>alternative 1 is also the only reasonable way to make ordinary
>strings compatible with SRE  (see the earlier discussion for why
>SRE has to be strict on this one...)
>
>so let's change the question into a proposal:
>
>    for maximum compatibility with 8-bit strings and SRE,
>    let's change "\x" to mean "binary byte" in unicode string
>    literals too.

This would potentially break JPython where the \x is already used to
introduce 16-bit chars in ordinary strings. OTOH the implementation of
\x in JPython is so full of bugs and inconsistencies that I'm +1 on your
proposal.
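
To make the difference between the two readings concrete, here is a rough sketch. It assumes the escape swallows the whole run of hex digits, as the ord() comparison quoted above implies, and that a "16-bit char" means keeping the low 16 bits. The helper name x_escape_value is made up for illustration and is not taken from CPython, JPython or SRE.

```python
def x_escape_value(hex_digits, as_byte):
    """Return the ordinal a greedy \\x escape would yield (illustrative only).

    hex_digits: the run of hex digits after the \\x, e.g. "abcd".
    as_byte:    True  -> alternative 1, keep only the low 8 bits (a byte)
                False -> alternative 5, keep the low 16 bits (a 16-bit char)
    """
    value = int(hex_digits, 16)
    mask = 0xFF if as_byte else 0xFFFF  # assumed widths, see lead-in
    return value & mask


# 8-bit string literal: \x is always read as a byte.
eight_bit = x_escape_value("abcd", as_byte=True)

# Unicode literal under alternative 5: \x is read as a character.
unicode_alt5 = x_escape_value("abcd", as_byte=False)

# Unicode literal under alternative 1: \x is read as a byte, same as above.
unicode_alt1 = x_escape_value("abcd", as_byte=True)

print(hex(eight_bit), hex(unicode_alt5), hex(unicode_alt1))
```

Under these assumptions only alternative 1 gives the same ordinal with and without the leading "u", which is the compatibility point made in the quoted text.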

regards,
finn