Removing Unicode from Python?

Paradox JoeyTaj at netzero.com
Thu Oct 30 13:25:12 EST 2003


Brian Quinlan <brian at sweetapp.com> wrote in message news:<mailman.232.1067500493.702.python-list at python.org>...
> > In general I love Python for text manipulation but at our company we
> > have the need to manipulate large text values stored in either a SQL
> > Server database or text files. This data is stored in a "text" field
> > type and is definitely not unicode though it is often very strange
> > text since it is either OCR or some kinda electronic file extraction.
> > Unfortunately when it is retrieved into a string type in python it is
> > invariably a unicode type string. The best I can do is try to encode
> > it to 'latin-1', but that will often throw an error; if I use the
> > ignore parameter, it whacks my data with a bunch of "?". I am
> > just not understanding why Python thinks the data is unicode and why
> > it is failing on conversion. There is no way that a byte cannot be
> > between 0 and 255, right? This problem can be so haunting that I will
> > start to wish I had coded the solution in VB where at least a string
> > is a string is a string. Is there a way to modify Python so that all
> > strings will always be single byte strings since we have no need for
> > Unicode support? Any solutions or suggestions to my biggest Python
> > annoyance would be greatly appreciated.
> 
> What module are you using to access your database? Are you positive that
> the column that you are accessing is of type "text" (and not, say,
> "ntext")?
> 
> Cheers,
> Brian

Yes, I am positive. I will try to come up with a sample next time.
Isn't utf-8 the same as latin-1? I think I have tried both. As for
VB strings really being unicode, I think it is irrelevant because VB
doesn't whack my data or throw exceptions.
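
For reference, here is a minimal sketch of the encoding behaviour discussed above. It uses modern Python 3 syntax rather than the Python 2 of this thread, and the sample string is made up, since no real data from the database has been posted. It shows that utf-8 and latin-1 produce different bytes for the same character, and that latin-1 encoding fails only when the unicode string contains a code point above 255 (curly quotes and the euro sign are typical in OCR or Windows-sourced text). The guess that the source data is really cp1252 is an assumption, not something confirmed here. Note also that in Python itself "ignore" drops the offending characters, while "replace" is the handler that substitutes "?".

```python
# Hypothetical sample: accented letter, curly quotes, euro sign.
sample = u"caf\u00e9 \u201cquoted\u201d \u20ac5"

# utf-8 and latin-1 are different encodings: same character, different bytes.
latin1_e = u"\u00e9".encode("latin-1")   # one byte
utf8_e = u"\u00e9".encode("utf-8")       # two bytes

# latin-1 only covers code points 0-255, so the curly quotes
# (U+201C, U+201D) and the euro sign (U+20AC) cannot be encoded.
try:
    sample.encode("latin-1")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# "replace" substitutes "?" for each unencodable character;
# "ignore" silently drops them instead.
replaced = sample.encode("latin-1", "replace")
ignored = sample.encode("latin-1", "ignore")

# Assumption: if the driver decoded the column as Windows code page 1252,
# encoding back to cp1252 recovers the original single-byte data.
cp1252_bytes = sample.encode("cp1252")

print(latin1_e, utf8_e)
print(strict_failed)
print(replaced)
print(ignored)
print(cp1252_bytes)
```

If cp1252 round-trips a real sample without errors, that would suggest the bytes were never damaged, only decoded to unicode by the database module on the way in.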



