Removing Unicode from Python?

"Martin v. Löwis" martin at v.loewis.de
Thu Oct 30 03:09:27 EST 2003


Paradox wrote:

> In general I love Python for text manipulation but at our company we
> have the need to manipulate large text values stored in either a SQL
> Server database or text files. This data is stored in a "text" field
> type and is definitely not unicode though it is often very strange
> text since it is either OCR or some kinda electronic file extraction.
> Unfortunately when it is retrieved into a string type in python it is
> invariably a unicode type string. The best I can do is try to encode
> it to 'latin-1', but that will often throw an error; if I use the
> ignore parameter, then it will whack my data with a bunch of "?".

Can you give an example of such a string? Reporting its repr() would help.

If you want to encode arbitrary Unicode strings into byte strings, you
can use "utf-8" as the encoding.
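
As a rough illustration of both points, here is a minimal sketch. It
assumes current Python 3 syntax, where str is the Unicode type and
encode() returns bytes; the sample text is made up and only stands in
for the kind of OCR output described above.

```python
# Hypothetical sample: an i-diaeresis (in latin-1) plus curly quotes and
# an en dash (not in latin-1). Real OCR output will differ.
text = "na\u00efve \u201cquoted\u201d text \u2013 OCR"

# repr()/ascii() show exactly which code points the string contains.
print(repr(text))
print(ascii(text))

# Strict latin-1 encoding fails on the characters latin-1 cannot represent.
try:
    text.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# errors="replace" substitutes "?" for each unencodable character,
# which loses information.
lossy = text.encode("latin-1", errors="replace")

# UTF-8 can represent all of these characters, so nothing is lost and
# decoding gives back the original string.
data = text.encode("utf-8")
round_trip = data.decode("utf-8")
print(latin1_ok, lossy, round_trip == text)
```

The byte string in data can then be written to a file or passed on,
provided whatever reads it also treats it as UTF-8.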

Regards,
Martin




