Removing Unicode from Python?

jack jackh at sparkingwire.com
Tue Nov 4 13:53:52 EST 2003


On 29 Oct 2003 23:12:39 -0800, Paradox wrote:

> In general I love Python for text manipulation, but at our company we
> need to manipulate large text values stored in either a SQL Server
> database or text files. This data is stored in a "text" field type and
> is definitely not unicode, though it is often very strange text since
> it comes from either OCR or some kind of electronic file extraction.
> Unfortunately, when it is retrieved into a string type in Python it is
> invariably a unicode type string. The best I can do is try to encode
> it to 'latin-1', but that will often throw an error, and if I use the
> ignore parameter it will whack my data with a bunch of "?". I am
> just not understanding why Python thinks the data is unicode and why
> the conversion fails. There is no way that a byte can not be
> between 0 and 255, right? This problem can be so haunting that I
> start to wish I had coded the solution in VB, where at least a string
> is a string is a string. Is there a way to modify Python so that all
> strings will always be single byte strings, since we have no need for
> Unicode support? Any solutions or suggestions to my biggest Python
> annoyance would be greatly appreciated.
> 
>                 Thanks Joey

I had a similar problem with SQL Server. My solution was to create a
sitecustomize.py file containing:

import sys
sys.setdefaultencoding("utf-8")


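This works for me. It does not really turn unicode off; it changes the
default encoding Python uses for implicit conversions between unicode and
byte strings from ASCII to UTF-8, so the conversion errors stop. I was
unable to find any other solution that I could understand (I'm not a
programmer and have only just started with Python).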
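
For anyone who would rather not change the interpreter-wide default, the
conversion can also be done explicitly at the point where the text is
written out. The following is only a rough sketch, not something taken
from my own setup: the sample text and the to_bytes helper are made up for
illustration, and it assumes the odd characters are things like curly
quotes and bullets, which have code points above 255 and so cannot fit in
'latin-1'.

```python
# Hypothetical sample: OCR-style text with characters outside Latin-1
# (e-acute is inside Latin-1; the curly quotes and bullet are not).
text = u"caf\u00e9 \u201cquoted\u201d \u2022 item"


def to_bytes(value, encoding="utf-8", errors="strict"):
    """Encode a unicode string explicitly (hypothetical helper)."""
    return value.encode(encoding, errors)


# Latin-1 only covers code points 0-255, so strict encoding fails here.
try:
    to_bytes(text, "latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# With errors="replace", each unencodable character becomes "?".
latin1_lossy = to_bytes(text, "latin-1", "replace")

# UTF-8 can represent every code point, and decoding restores the text.
utf8_bytes = to_bytes(text, "utf-8")
round_trip = utf8_bytes.decode("utf-8")

# cp1252 (the Windows "ANSI" code page) is still one byte per character
# and happens to include the curly quotes and bullet used above.
cp1252_bytes = to_bytes(text, "cp1252")
```

Which target encoding is right depends on what reads the bytes afterwards.
Of the three shown, UTF-8 is the only one that cannot fail on unusual
characters.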

jack
sidelined in order to prevent discrimination on the gender front



