[Python-Dev] email package Bytes vs Unicode (was Re: Dropping bytes "support" in json)

Tony Nelson tonynelson at georgeanelson.com
Thu Apr 9 19:14:21 CEST 2009


(email-sig dropped, as I didn't see Steve Holden's message there)

At 12:20 -0400 04/09/2009, Steve Holden wrote:
>Tony Nelson wrote:
 ...
>> If you need the data from the message, by all means extract it and store it
>> in whatever form is useful to the purpose of the database.  If you need the
>> entire message, store it intact in the database, as the bytes it is.  Email
>> isn't Unicode any more than a JPEG or other image types (often payloads in
>> a message) are Unicode.
>
>This is all great, and I did quite quickly realize that the best
>approach was to store the mails in their network byte-stream format as
>bytes. The approach was negated in my own case because of PostgreSQL's
>execrable BLOB-handling capabilities. I took a look at the escaping they
>required, snorted with derision and gave it up as a bad job.
 ...

I use MySQL, but sort of intend to learn PostgreSQL.  I didn't know that
PostgreSQL's BLOB support was that poor.  I agree that having to import
them from a file is awful.  Also, as far as I can tell there is a severe
limit on the size of character data fields, so storing in Base64 is out.
If so, about the only thing to do is to use external storage for the BLOBs.

Still, email seems to demand such binary storage, whether all databases
provide it or not.
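
For what it's worth, here is a minimal sketch of what "store it intact, as
the bytes it is" could look like.  It assumes SQLite from the standard
library (not MySQL or PostgreSQL, only to keep the example self-contained)
and a Python 3 new enough to have email.message_from_bytes.  The table
name, the helper names and the sample message are all made up for
illustration.

```python
import sqlite3
from email import message_from_bytes

# Hypothetical message as it might arrive off the wire: CRLF line endings
# and an 8-bit Latin-1 byte in the body that is not valid UTF-8.
RAW = (b"From: a@example.com\r\n"
       b"To: b@example.com\r\n"
       b"Subject: test\r\n"
       b"Content-Type: text/plain; charset=iso-8859-1\r\n"
       b"Content-Transfer-Encoding: 8bit\r\n"
       b"\r\n"
       b"caf\xe9\r\n")


def store_message(conn, raw):
    """Insert the raw bytes unchanged into a BLOB column; return the row id."""
    cur = conn.execute("INSERT INTO messages (raw) VALUES (?)", (raw,))
    return cur.lastrowid


def load_message(conn, msg_id):
    """Fetch the stored bytes back exactly as they were written."""
    row = conn.execute(
        "SELECT raw FROM messages WHERE id = ?", (msg_id,)).fetchone()
    return bytes(row[0])


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, raw BLOB NOT NULL)")

msg_id = store_message(conn, RAW)
stored = load_message(conn, msg_id)

# Parse only when the data is needed; the stored copy stays untouched.
subject = message_from_bytes(stored)["Subject"]
```

The point of the sketch is only that the message goes in and comes out
byte for byte, and that anything extracted from it (here the Subject) is
derived from the stored bytes rather than replacing them.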
-- 
____________________________________________________________________
TonyN.:'                       <mailto:tonynelson at georgeanelson.com>
      '                              <http://www.georgeanelson.com/>

