Encoding of Python 2 string literals

Wed Jul 22 10:12:49 EDT 2015

In a message of Wed, 22 Jul 2015 22:39:56 +1000, Chris Angelico writes:
>On Wed, Jul 22, 2015 at 8:17 PM, anatoly techtonik <techtonik at gmail.com> wrote:
>> Is there a way to know encoding of string (bytes) literal
>> defined in source file? For example, given that source:
>>
>>     # -*- coding: utf-8 -*-
>>     from library import Entry
>>     Entry("текст")
>>
>> Is there any way for Entry() constructor to know that
>> string "текст" passed into it is the utf-8 string?
>
>I don't think so. However, if you declare that to be a Unicode string,
>the parser will decode it using the declared encoding, and it'll be a
>five-character string. At that point, it doesn't matter what your
>source encoding was, because the characters entered will match the
>characters seen.
>
>Entry(u"текст")
>
>ChrisA
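
To make Chris's point concrete, here is a small sketch. It is written to run on Python 3, so .encode() stands in for what Python 2 would store for a plain byte-string literal; the variable names are mine, not from the thread. A plain "текст" holds whatever bytes the source file used, while u"текст" is decoded into five characters.

```python
# -*- coding: utf-8 -*-
# Sketch only: .encode() imitates what Python 2 would store for a plain
# (byte) string literal, whose bytes follow the source file's encoding.
text = u"текст"                      # what u"..." gives: decoded characters
raw_utf8 = text.encode("utf-8")      # bytes of "текст" in a utf-8 source file
raw_cp1251 = text.encode("cp1251")   # same literal if the file were cp1251

print(len(text))        # 5
print(len(raw_utf8))    # 10
print(len(raw_cp1251))  # 5

# The bytes carry no record of their encoding; the receiver must be told
# which one to use before it can recover the characters.
assert raw_utf8.decode("utf-8") == raw_cp1251.decode("cp1251") == text
```

So Entry() sees only a sequence of bytes and cannot tell which encoding produced them, unless the caller passes a Unicode string instead.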

Since you are porting to 3.x, anatoly, this will be of interest to you:
https://www.python.org/dev/peps/pep-0414/

PEP 414 brought the u"..." prefix back in Python 3.3. So, having stuck
all the u" prefixes into your codebase, you won't immediately have to
rip them all out again, as long as you use Python 3.3 or above.
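
As a rough illustration (the name entry_text is made up for this example), a file like the following should behave the same on Python 2 and on Python 3.3 or later. On 3.0 through 3.2 the u prefix is a syntax error.

```python
# -*- coding: utf-8 -*-
# Runs on Python 2.x and on 3.3+ (PEP 414); SyntaxError on 3.0-3.2.
import sys

entry_text = u"текст"   # hypothetical value you would pass to Entry()

if sys.version_info[0] >= 3:
    # On Python 3 the u prefix changes nothing: the literal is already str.
    assert entry_text == "текст"

print(len(entry_text))  # 5
```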

Laura