unicode compare errors

Ross rossgk at gmail.com
Fri Dec 10 14:51:44 EST 2010


I've a character encoding issue that has stumped me (not that hard to
do). I am parsing a small text file with some possibility of various
currencies being involved, and want to handle them without messing up.

Initially I was simply doing:

  currs = [u'$', u'£', u'€', u'¥']
  aFile = open(thisFile, 'r')
  for mline in aFile:              # mline might be "£5.50"
      if mline[0] in currs:
          mline = mline[1:]

But the problem was:
   SyntaxError: Non-ASCII character '\xa3' in file

The remedy was of course to declare the file encoding for my Python
module. At the start of the file I used:

# -*- coding: UTF-8 -*-

That allowed me to progress. But now when I come to a line item that
uses a non-$ currency, I get this error:

views.py:3364: UnicodeWarning: Unicode equal comparison failed to
convert both arguments to Unicode - interpreting them as being
unequal.

…which I think means Python is unable to convert the characters in the
file I'm reading from into unicode to compare them to the items in the
list currs.

I think this is saying that u'£' == '£' is false.
(I hope those chars show up okay in my post here)
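
To make the suspicion above concrete, here is a minimal sketch of what
seems to be going on, assuming the input file is UTF-8. The line read
from the file is a byte string, and in UTF-8 the pound sign is two bytes
(0xC2 0xA3), so the first byte of the line can never equal the single
unicode character u'£'. Decoding the bytes first makes the comparison
meaningful. The function name strip_currency and its encoding parameter
are hypothetical, made up for this illustration only.

```python
# -*- coding: utf-8 -*-
CURRENCIES = [u'$', u'£', u'€', u'¥']


def strip_currency(raw_line, encoding='utf-8'):
    """Decode a raw byte line, then drop a leading currency symbol if present.

    `encoding` is an assumption about the input file, not something
    the file itself tells us.
    """
    text = raw_line.decode(encoding).strip()
    if text and text[0] in CURRENCIES:
        text = text[1:]
    return text


# Bytes as they would arrive from a file opened in binary mode.
raw = u'£5.50\n'.encode('utf-8')
print(strip_currency(raw))
```

The point is only that both sides of the comparison are unicode by the
time `in` is evaluated; it does not settle how to find out the real
encoding of the file.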

Since I can't control the encoding of the input file that users
submit, how do I get past this?  How do I make such comparisons be
True?
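
One approach I can imagine, sketched here under the assumption that
uploads are almost always either UTF-8 or a Windows/Western European
encoding, is to try a short list of encodings in order and fall back to
Latin-1, which maps every byte to some character and therefore never
raises. The name decode_line and the candidate list are hypothetical,
and a wrong guess would still yield the wrong characters without an
error.

```python
# -*- coding: utf-8 -*-
def decode_line(raw_line, encodings=('utf-8', 'cp1252')):
    """Return raw_line decoded with the first candidate encoding that works."""
    for enc in encodings:
        try:
            return raw_line.decode(enc)
        except UnicodeDecodeError:
            continue
    # Last resort: latin-1 accepts every byte value, so this cannot fail,
    # but the result may not be what the user actually typed.
    return raw_line.decode('latin-1')


print(decode_line(u'£5.50'.encode('utf-8')) == u'£5.50')
print(decode_line(u'£5.50'.encode('cp1252')) == u'£5.50')
```

UTF-8 goes first because bytes from the other encodings are rarely valid
UTF-8, so a successful UTF-8 decode is a reasonably strong hint.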

Thanks in advance for any suggestions
Ross.





