codecs, csv issues

George Sakkis george.sakkis at gmail.com
Fri Aug 22 09:52:42 EDT 2008


I'm trying to use codecs.open() and I see two issues when I pass
encoding='utf8':

1) Newlines are always written as a single LINEFEED (ASCII 10) instead
of the platform-specific line ending (e.g. CR LF on Windows).

    import codecs
    f = codecs.open('tmp.txt', 'w', encoding='utf8')
    s = u'\u0391\u03b8\u03ae\u03bd\u03b1'
    print >> f, s
    print >> f, s
    f.close()

This doesn't happen with the default encoding (encoding=None).
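
To see exactly which bytes end up on disk, the file can be read back in
binary mode. This is only a sketch: it writes into a temporary directory
and uses f.write() rather than print >> so that it runs on both Python 2
and 3, and the names path, path_native, data and data_native are mine.
As far as I can tell from the codecs documentation, codecs.open() opens
the underlying file in binary mode whenever an encoding is given, which
would explain why '\n' is not translated. Writing os.linesep explicitly
is one possible workaround.

```python
import codecs
import os
import tempfile

s = u'\u0391\u03b8\u03ae\u03bd\u03b1'
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'tmp.txt')
path_native = os.path.join(tmpdir, 'tmp_native.txt')

# With an encoding given, the file is opened in binary mode,
# so u'\n' reaches the disk as a single LF byte on every platform.
f = codecs.open(path, 'w', encoding='utf8')
f.write(s + u'\n')
f.close()

# Possible workaround: write the platform line ending explicitly.
f = codecs.open(path_native, 'w', encoding='utf8')
f.write(s + os.linesep)
f.close()

# Read both files back as raw bytes to inspect the line endings.
with open(path, 'rb') as raw:
    data = raw.read()
with open(path_native, 'rb') as raw:
    data_native = raw.read()

print(repr(data))
print(repr(data_native))
```

On Windows the second file should end in CR LF, while the first ends in
a bare LF everywhere.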

2) csv.writer doesn't work as expected when passed the file object
returned by codecs.open(); it behaves as if the encoding were ascii:

    import codecs, csv
    f = codecs.open('tmp.txt', 'w', encoding='utf8')
    s = u'\u0391\u03b8\u03ae\u03bd\u03b1'
    # this works fine
    print >> f, s
    # this doesn't
    csv.writer(f).writerow([s])
    f.close()

Traceback (most recent call last):
...
    csv.writer(f).writerow([s])
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0391' in position 0: ordinal not in range(128)
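
One workaround I can think of is to bypass the codecs object for the csv
part. The sketch below is an assumption on my side, not something taken
from the csv module itself: write_row_utf8 is a hypothetical helper name,
and the version check is there only so the same snippet runs on both
Python 2 and 3. On Python 2 the cells are encoded to UTF-8 before csv
sees them and the file is opened in binary mode. On Python 3 csv works on
text, so the file object does the encoding.

```python
import csv
import io
import os
import sys
import tempfile

PY2 = sys.version_info[0] == 2


def write_row_utf8(path, row):
    """Write one CSV row of unicode strings to `path`, encoded as UTF-8."""
    if PY2:
        # Python 2: csv handles byte strings, so encode each cell first
        # and hand csv a file opened in binary mode.
        with open(path, 'wb') as f:
            csv.writer(f).writerow([cell.encode('utf-8') for cell in row])
    else:
        # Python 3: csv handles text; the file object encodes on write.
        # newline='' leaves the line terminator to the csv module.
        with io.open(path, 'w', encoding='utf-8', newline='') as f:
            csv.writer(f).writerow(row)


s = u'\u0391\u03b8\u03ae\u03bd\u03b1'
csv_path = os.path.join(tempfile.mkdtemp(), 'tmp.csv')
write_row_utf8(csv_path, [s])

# Read the result back as raw bytes to check the encoding.
with open(csv_path, 'rb') as raw:
    csv_bytes = raw.read()
print(repr(csv_bytes))
```

Either way the row ends with csv's default '\r\n' line terminator,
regardless of platform.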

Is this the expected behavior, or are these bugs?

George


