Unicode (UTF8) in dbhash on 2.5

"Martin v. Löwis" martin at v.loewis.de
Tue Oct 21 20:06:14 EDT 2008


>> Many database engines are encoding-aware, and distinguish between
>> 'text' columns and 'blob' columns -- the latter are arbitrary bags
>> of bytes, but text columns store text, and a good database (with a
>> sensibly designed database) will be aware of this and handle
>> encoding and decoding of text responsibly.

Ok, by this definition, the dbm interface of Unix is not a good
database. Tough luck.

>> I can tell you that in REALbasic, if your database is properly 
>> configured to use UTF-8 encoding, the rest is all handled
>> seamlessly -- you just store and retrieve text, and don't have to
>> worry about encoding and decoding things all over the place.

In Python, the database system is independent of the programming
language. Python can deal with

>> So the OP's request is quite valid.

Which of the questions specifically?

Q: Can you put UTF-8 characters in a dbhash in python 2.5 ?
A: Sure, certainly.

Q: Do I need to change the bsd db library,
or there is no way to make it work with python 2.5 ?
A: You don't need to change the bsd db library; it works out
of the box.

Q: What about python 2.6 ?
A: It's the same.

He got essentially the answers to the questions he asked.
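
To make the first answer concrete, here is a minimal sketch of storing
UTF-8 data in a dbm-style database. Note the assumptions: dbhash exists
only in Python 2, so this sketch assumes Python 3 and uses the standard
dbm.dumb module as a stand-in with the same dict-like interface. The
helper name roundtrip and the key b"name" are made up for illustration.
The point is the same in either case: the database stores byte strings,
and encoding and decoding are left to the caller.

```python
import dbm.dumb   # assumption: stand-in for Python 2's dbhash
import os
import shutil
import tempfile


def roundtrip(text):
    """Store a Unicode string as UTF-8 bytes, then read it back."""
    tmpdir = tempfile.mkdtemp()
    try:
        db = dbm.dumb.open(os.path.join(tmpdir, "demo"), "c")
        try:
            # The database only sees bytes; encoding is the caller's job.
            db[b"name"] = text.encode("utf-8")
            raw = db[b"name"]
        finally:
            db.close()
    finally:
        shutil.rmtree(tmpdir)
    # Decoding on the way out is likewise the caller's job.
    return raw, raw.decode("utf-8")


raw, text = roundtrip(u"L\u00f6wis")
```

Here raw is the UTF-8 byte string that was actually stored, and text is
the Unicode string recovered from it by decoding.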

>> Python's handling of encodings is currently primitive compared to
>> some other environments, and I see that this extends to the
>> database modules.

That's *not* a question that he had asked. He asked about UTF-8, but
perhaps meant to ask about Unicode (in particular as his example did
not demonstrate any problems with UTF-8 encoded strings).

>> Fine, fair enough, it is what it is, but there is no harm in asking
>> about (or even yearning for) a more intelligent system that does
>> more of the grunt work for us.

It *is* important to understand the difference between a "UTF-8
string" and a "Unicode string". If the OP hadn't been confused
about the two, and fully understood the difference, he probably
wouldn't have needed to ask.
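
A minimal sketch of that difference (the sample value is only an
illustration): a Unicode string is a sequence of characters, while a
UTF-8 string is the sequence of bytes produced by encoding it.

```python
text = u"L\u00f6wis"            # Unicode string: 5 characters
data = text.encode("utf-8")     # UTF-8 string: 6 bytes, o-umlaut takes 2

assert len(text) == 5
assert len(data) == 6
assert data.decode("utf-8") == text   # decoding recovers the characters
```

Only the second kind, the encoded bytes, is something a byte-oriented
store such as dbhash can hold.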

Regards,
Martin


