UnicodeDecodeError quick question

patrick.waldo at gmail.com patrick.waldo at gmail.com
Thu Dec 4 10:57:49 EST 2008


Hi Everyone,

I am using Python 2.4 to convert an Excel spreadsheet to a
pipe-delimited text file, and some of the cells contain non-ASCII
characters.  I solved the problem in a very unintuitive way and I
wanted to ask why it works.  If I do,

csvfile.write(cell.encode("utf-8"))

I get a UnicodeDecodeError.  However, this works:

c = unicode(cell.encode("utf-8"),"utf-8")
csvfile.write(c)

Why should I have to encode the cell to utf-8 and then convert it
back to unicode in order to write it to a text file?  Is there a more
intuitive way to get around these bothersome unicode errors?

Thanks for any advice,
Patrick

Code:

# -*- coding: utf-8 -*-
import xlrd,codecs,os

xls_file = "/home/pwaldo2/work/docpool_plone/2008-12-4/EU-2008-12-4.xls"
book = xlrd.open_workbook(xls_file)
bibliography_sheet = book.sheet_by_index(0)

# same directory and base name as the spreadsheet, with a .csv extension
csv = os.path.splitext(xls_file)[0] + '.csv'
csvfile = codecs.open(csv,'w',encoding='utf-8')

rowcount = 0
data = []
while rowcount<bibliography_sheet.nrows:
    data.append(bibliography_sheet.row_values(rowcount,
                                              start_colx=0, end_colx=None))
    rowcount+=1
for row in data:
    for cell in row:
        # csvfile.write(cell.encode("utf-8"))  # this causes the UnicodeDecodeError
        c = unicode(cell.encode("utf-8"),"utf-8")
        csvfile.write(c)
        csvfile.write('|')
    csvfile.write('\r\n')
csvfile.close()
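
For reference, the symptom can be reproduced without the spreadsheet.
This is only a sketch of what I assume is happening: the writer
returned by codecs.open expects unicode, so a byte string handed to it
is first decoded with the default ASCII codec, and that implicit step
is what fails.  The decode is written out explicitly here so that it
behaves the same regardless of Python version, and the cell value is
made up for illustration.

```python
# -*- coding: utf-8 -*-
# Sketch only: "cell" is a hypothetical value standing in for a text cell.
cell = u"caf\xe9"                 # unicode text containing one non-ASCII character
encoded = cell.encode("utf-8")    # byte string: 'caf\xc3\xa9'

# Assumption: this is the step a unicode-expecting writer performs
# implicitly when it is given bytes instead of unicode.
try:
    encoded.decode("ascii")
    failed = False
except UnicodeDecodeError:
    failed = True

# Decoding with the right codec just undoes the encode, which is what
# unicode(cell.encode("utf-8"), "utf-8") amounts to.
roundtrip = encoded.decode("utf-8")
```

If that assumption holds, the round trip only hands back the original
unicode value, so writing the cell directly with csvfile.write(cell)
should be equivalent.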
