Removing Unicode from Python?

Brian Quinlan brian at sweetapp.com
Thu Oct 30 02:55:42 EST 2003


> In general I love Python for text manipulation but at our company we
> have the need to manipulate large text values stored in either a SQL
> Server database or text files. This data is stored in a "text" field
> type and is definitely not unicode though it is often very strange
> text since it is either OCR or some kinda electronic file extraction.
> Unfortunately when it is retrieved into a string type in python it is
> invariably a unicode type string. The best I can do is try to encode
> it to 'latin-1', but that will often throw an error; if I use the
> ignore parameter then it will whack my data with a bunch of "?". I am
> just not understanding why python is thinking stuff is unicode and why
> it is failing on conversion. There is no way that a byte can not be
> between 0 and 255 right? This problem can be so haunting that I will
> start to wish I had coded the solution in VB where at least a string
> is a string is a string. Is there a way to modify Python so that all
> strings will always be single byte strings since we have no need for
> Unicode support? Any solutions or suggestions to my biggest Python
> annoyance would be greatly appreciated.

What module are you using to access your database? Are you positive that
the column that you are accessing is of type "text" (and not, say,
"ntext")?
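
While checking that, it may help to see exactly which characters latin-1
rejects. Below is a minimal sketch, not a definitive fix. It assumes
Python 3 syntax, a made-up sample string, and a hypothetical helper named
non_latin1_chars; none of these come from the original report. It lists
the code points above 255 and compares the "replace" and "ignore" error
handlers. The cp1252 line only applies if the data really is
Windows-1252 text, which is a guess on my part.

```python
def non_latin1_chars(text):
    """Return the distinct characters in text that latin-1 cannot encode.

    latin-1 covers code points 0..255 only, so anything above 0xFF fails.
    (Hypothetical helper, for illustration.)
    """
    return sorted({ch for ch in text if ord(ch) > 0xFF})


# Made-up sample: curly quotes and an en dash, typical of OCR or
# word-processor output. These are the characters that break latin-1.
sample = "caf\u00e9 \u201cquoted\u201d \u2013 done"

# Show the offending code points so they can be looked up.
print([hex(ord(ch)) for ch in non_latin1_chars(sample)])

# "replace" substitutes "?" for each unencodable character.
print(sample.encode("latin-1", errors="replace"))

# "ignore" silently drops each unencodable character.
print(sample.encode("latin-1", errors="ignore"))

# Assumption: if the source data is Windows-1252, that codec can
# represent these particular characters as single bytes.
print(sample.encode("cp1252"))
```

If the listed code points turn out to be things like curly quotes or
dashes, the text is not plain latin-1 to begin with, which would explain
why the conversion fails.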

Cheers,
Brian





