python 2.7 and unicode (one more time)

Thu Nov 20 09:14:16 EST 2014

Hi,

On 11/20/2014 11:47 AM, Chris Angelico wrote:
> On Thu, Nov 20, 2014 at 8:40 PM, Francis Moreau <francis.moro at gmail.com> wrote:
>> My question is: how should this be fixed properly ?
>>
>> A simple solution would be to force all strings passed to the
>> logger to be unicode:
>>
>>   log.debug(u"%s: %s" % ...)
>>
>> and more generally force all string in my code to be unicode by
>> using the 'u' prefix.
> 
> Yep. And then you may want to consider "from __future__ import
> unicode_literals", which will make string literals represent Unicode
> strings rather than byte strings. Basically the same as you're saying,
> only without the explicit u prefixes.

Thanks for the "from __future__ import unicode_literals" trick; it makes
the switch much less intrusive.

However, it seems that I will now be caught out by every module that
is not prepared to handle unicode strings. For example:

 >>> from __future__ import unicode_literals
 >>> import locale
 >>> locale.setlocale(locale.LC_ALL, 'fr_FR')
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/usr/lib64/python2.7/locale.py", line 546, in setlocale
     locale = normalize(_build_localename(locale))
   File "/usr/lib64/python2.7/locale.py", line 453, in _build_localename
     language, encoding = localetuple
 ValueError: too many values to unpack
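
Judging only from the traceback (I have not read locale.py closely, so
this is an assumption), the unicode string seems to be treated as a
(language, encoding) tuple and unpacked character by character. A minimal
sketch of that failure, where build_localename is a hypothetical stand-in
and not the real function:

```python
def build_localename(localetuple):
    # Hypothetical stand-in for the unpacking step shown in the traceback;
    # it is not the real locale._build_localename.
    language, encoding = localetuple
    return language + "." + encoding


# A two-item tuple unpacks cleanly.
ok = build_localename(("fr_FR", "UTF-8"))

# A five-character string is iterated character by character, so the
# two-name unpacking fails in the same way as in the traceback.
try:
    build_localename("fr_FR")
    error = None
except ValueError as exc:
    error = str(exc)
```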

Is the locale module an exception, in which case I'll fix it by doing:

 >>> locale.setlocale(locale.LC_ALL, b'fr_FR')

or are a (large) number of the modules in Python 2.7 still not ready for
unicode, in which case I have to decide case by case which prefix (u or b)
to add manually?
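
One option I can think of, as a rough sketch rather than a tested fix,
would be to keep unicode literals everywhere and convert only at the call
sites that insist on a plain str. The helper name native and its ascii
default are my own assumptions, not anything the stdlib provides:

```python
import locale


def native(text, encoding="ascii"):
    """Return text as the interpreter's native str type."""
    if isinstance(text, str):
        return text
    if str is bytes:
        # Python 2: text is unicode, so encode it to a byte str.
        return text.encode(encoding)
    # Python 3: text is bytes, so decode it to str.
    return text.decode(encoding)


# The u prefix stands in for what a bare literal becomes under
# unicode_literals. "C" is used only because that locale always exists.
result = locale.setlocale(locale.LC_ALL, native(u"C"))
```

With that, the failing call above would become
locale.setlocale(locale.LC_ALL, native('fr_FR')), assuming the fr_FR
locale is installed on the machine.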

Thanks.