How to store ASCII encoded python string?

Jean-Paul Calderone exarkun at divmod.com
Mon Aug 28 17:01:43 EDT 2006


On 28 Aug 2006 13:51:58 -0700, micahc at gmail.com wrote:
>Fredrik Lundh wrote:
>> 3) convert the data to Unicode before passing it to the database
>> interface, and leave it to the interface to convert it to whatever
>> encoding your database uses:
>>
>>      data = ... get encoded string from email ...
>>      text = data.decode("iso-8859-1")
>>      ... write text to database ...
>
>Wouldn't that have to assume that all incoming data is in iso-8859-1?
>If someone sends me an email with Chinese characters would that still
>work (I don't know the character set at data insert time)?
>

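Yes, it would still work.  Every byte string is valid ISO-8859-1, so the
decode can never fail, whatever character set the sender really used.
For clarity, you may want to use the codec name "charmap" instead.  It
behaves identically to ISO-8859-1, but does not suggest to someone
reading the source code that the data is actually in that encoding.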

Another solution, which might be better, is to select a data type for
this column which can handle arbitrary bytes.  This lets you avoid
mangling the input at all.  Different databases have different column
types for this.  For PostgreSQL, you might want to look at BYTEA.
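
A rough sketch of the idea in Python 3.  To keep it self-contained, it
uses the standard library's sqlite3 module and a BLOB column as a
stand-in for a PostgreSQL BYTEA column.  The table and column names are
made up for illustration, and a PostgreSQL driver will have its own way
of passing binary parameters:

```python
import sqlite3

# In-memory database with a column that accepts arbitrary bytes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, body BLOB)")

# Bytes in an unknown encoding, including values that are not valid UTF-8.
raw = b"\xe4\xb8\xad\xe6\x96\x87 \xff\xfe\x00 plain ascii"

# Pass the bytes as a bound parameter; no decoding step is involved.
conn.execute("INSERT INTO messages (body) VALUES (?)", (raw,))

# What comes back is byte-for-byte what went in.
(stored,) = conn.execute("SELECT body FROM messages").fetchone()
conn.close()
```

The decision about which character set the data is in can then be put
off until the bytes are read back and actually need to be treated as
text.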

Jean-Paul


