encoding error

Terry Reedy tjreedy at udel.edu
Wed Feb 20 01:13:09 EST 2013


On 2/19/2013 8:07 PM, halagamal2009 at gmail.com wrote:
> UnicodeEncodeError: 'decimal' codec can't encode character u'\ufeff'
> in position 0: invalid decimal Unicode string

I believe that is a byte-order mark (BOM), U+FEFF. It should appear only
at the very start of the file (2 bytes in UTF-16, 3 bytes in UTF-8), and
it is removed if you use the proper decoder when reading the file,
before parsing it. For a UTF-8 file, the 'utf-8-sig' codec does that.
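
A minimal sketch of that, assuming the file is UTF-8 with a leading BOM
(the original post does not say which encoding was used). The helper
name read_text_without_bom and the temporary demo file are hypothetical,
made up for illustration; 'utf-8-sig' and codecs.BOM_UTF8 are standard
library names.

```python
import codecs
import io
import os
import tempfile


def read_text_without_bom(path, encoding="utf-8-sig"):
    """Read a text file; the 'utf-8-sig' codec drops a leading UTF-8 BOM.

    Hypothetical helper for illustration. If the file has no BOM,
    'utf-8-sig' behaves like plain 'utf-8' when decoding.
    """
    with io.open(path, "r", encoding=encoding) as f:
        return f.read()


# Build a small demo file that starts with a UTF-8 BOM (assumed scenario).
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(codecs.BOM_UTF8 + b"42\n")

# Plain 'utf-8' keeps the BOM as the first character of the text.
with io.open(demo_path, "r", encoding="utf-8") as f:
    raw = f.read()

# 'utf-8-sig' strips it, so the text can be parsed as a number.
clean = read_text_without_bom(demo_path)
number = int(clean)

os.remove(demo_path)
```

If the file is actually UTF-16, decoding with the 'utf-16' codec
consumes the BOM in the same way.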

You did not say which version of Python you are using, but I would use
3.3 or, failing that, 3.2 if possible.
http://pypi.python.org/pypi/Whoosh/
claims that Whoosh works with Python 3.

Also, read about the basics of Unicode if you have not done so yet.

-- 
Terry Jan Reedy



