Enable unicode

Chris Angelico rosuav at gmail.com
Tue Jan 28 04:00:42 EST 2014


On Tue, Jan 28, 2014 at 7:56 PM, Igor Korot <ikorot01 at gmail.com> wrote:
> Hi, Chris,

Hi! I'm hoping it was an oversight that led to this email coming to me
personally instead of to the list, and hoping that you won't mind me
responding on-list.

> On Tue, Jan 28, 2014 at 12:35 AM, Chris Angelico <rosuav at gmail.com> wrote:
>> On Tue, Jan 28, 2014 at 7:26 PM, Igor Korot <ikorot01 at gmail.com> wrote:
>>> Hi, ALL,
>>> In here: http://stackoverflow.com/questions/21397035/set-utf8-on-mysql,
>>> I got a suggestion to enable "use_unicode".
>>> The problem is that I'm developing on Windows, and I can't easily
>>> recompile my Python.
>>> I'm using Python 2.7 on Windows XP.
>>>
>>> Any pointers on how to enable "use_unicode"?
>>
>> Before you go any further: MySQL has a broken interpretation of "utf8"
>> that allows only a subset of the full Unicode range. Instead, use
>> "utf8mb4", which is what the rest of the world calls UTF-8. As far as
>> I know, you can just switch in utf8mb4 everywhere that you're
>> currently using utf8 and it'll work.
>
> So instead of using 'utf8' just use 'utf8mb4'?

Yes, that's right. If utf8mb4 isn't supported by your server or driver,
fall back to utf8 and test whether you can still store the full Unicode
range (something might be translating it for you, which would probably
be a good thing).
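
A rough way to see what "a subset of the full Unicode range" means in
practice: MySQL's legacy "utf8" stores at most three bytes per character,
while standard UTF-8 needs four bytes for anything outside the Basic
Multilingual Plane. The sketch below is pure Python (written for Python 3;
it does not talk to MySQL at all), and the helper names are made up for
illustration.

```python
# Sketch: which strings would NOT fit in a 3-byte-per-character "utf8" column.
# The function names are hypothetical; nothing here touches a database.

def utf8_length(ch):
    """Return how many bytes one character needs in standard UTF-8."""
    return len(ch.encode("utf-8"))

def fits_legacy_utf8(text):
    """True if every character needs at most 3 bytes in UTF-8."""
    return all(utf8_length(ch) <= 3 for ch in text)

samples = ["plain ASCII", "caf\u00e9", "\u65e5\u672c\u8a9e", "smile \U0001F600"]
for s in samples:
    # ascii() keeps the output readable on consoles that lack Unicode fonts.
    print(ascii(s), fits_legacy_utf8(s))
```

Running a check like this over some real data is one way to "see if you
can use the full range" before deciding between utf8 and utf8mb4.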

>> According to [1] the use_unicode flag is a keyword parameter to
>> connect(). As much as possible, I'd recommend using those parameters
>> rather than explicitly executing SQL statements to reconfigure the
>> connection - it's clearer, and the local client might want to
>> reconfigure itself in response to the change too.
>
> Is it supported on all versions of MySQLdb?

No idea! I don't use MySQLdb, so just give it a shot and see if it works.

>> Be aware that MySQL has a number of issues with Unicode and sorting
>> (or at least, it did the last time I checked, which was a while ago
>> now), not to mention other problems with its default MyISAM format.
>> You may want to consider PostgreSQL instead.
>
> I'm not using MyISAM, only InnoDB. ;-)

That's good, but it doesn't cover everything. You may find that
non-ASCII strings get mis-sorted.
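
To illustrate what mis-sorting can look like, here is a small pure-Python
sketch (Python 3). It does not reproduce MySQL's collation behaviour; it
only contrasts a naive code-point sort with an accent- and case-folded
sort, which is closer to what people usually expect. The word list and the
`fold` helper are invented for the example.

```python
import unicodedata

# Hypothetical sample data: one accented word, one capitalised word.
words = ["zebra", "\u00e9clair", "apple", "Eclair"]

# Naive sort compares code points, so "E" < "a" < "z" < "\u00e9".
naive = sorted(words)

def fold(s):
    """Strip combining accents and ignore case, for comparison only."""
    decomposed = unicodedata.normalize("NFKD", s)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()

# Folded sort groups the accented and unaccented spellings together.
folded = sorted(words, key=fold)

print([ascii(w) for w in naive])
print([ascii(w) for w in folded])
```

Comparing the order a database returns from ORDER BY against something
like `folded` is a quick way to spot collation surprises.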

I strongly prefer PostgreSQL for anything where I actually care about
the data I'm storing. And yes, that's everything that I store. So I
don't use MySQL anywhere any more :)

> So, how do I properly write the connection lines?

I've no idea - I don't actually use MySQLdb, I just looked at the docs
:) But try adding use_unicode=True to your connect() call.
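
An untested sketch of what that might look like, assuming MySQLdb is
installed and that your server and driver versions accept "utf8mb4" as a
charset (if not, swap in "utf8"). The host, user, password and database
values are placeholders, and `build_connect_kwargs` and `open_connection`
are hypothetical helpers, not part of MySQLdb.

```python
# Sketch only: collect the connection settings in one place so the charset
# and use_unicode flag are passed to connect() rather than set via SQL.

def build_connect_kwargs(host, user, passwd, db, charset="utf8mb4"):
    """Return keyword arguments for MySQLdb.connect() with Unicode enabled."""
    return dict(
        host=host,
        user=user,
        passwd=passwd,
        db=db,
        charset=charset,    # assumption: "utf8mb4" is supported; else "utf8"
        use_unicode=True,   # ask the driver to return unicode strings
    )

def open_connection(**kwargs):
    """Open a connection; MySQLdb is imported here so the rest runs without it."""
    import MySQLdb
    return MySQLdb.connect(**kwargs)

# Placeholder credentials, for illustration only.
kwargs = build_connect_kwargs("localhost", "someuser", "somepassword", "somedb")
print(kwargs["charset"], kwargs["use_unicode"])
# conn = open_connection(**kwargs)  # uncomment once MySQLdb is available
```

Passing the settings as connect() parameters, rather than running SET
NAMES by hand, keeps the client library informed about the encoding too.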

ChrisA


