Python UTF-8 and codecs
Serge Orlov
serge.orlov at gmail.com
Tue Jun 27 16:29:51 EDT 2006
On 6/27/06, Mike Currie <dev at null.com> wrote:
> I'm trying to write out files that have utf-8 characters 0x85 and 0x08 in
> them. With every configuration I try, I get a UnicodeError: ascii codec can't
> decode byte 0x85 in position 255: ordinal not in range(128)
>
> I've tried using codecs.open('foo.txt', 'rU', 'utf-8', errors='strict')
> and that doesn't work, and I've also tried wrapping the file in a utf8_writer
> using codecs.lookup('utf8')
>
> Any clues?
Use unicode strings for non-ascii characters. The following program "works":
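import codecs
# unichr(0x85) is the unicode character U+0085, not the byte 0x85
c1 = unichr(0x85)
# 'U' (universal newlines) only applies to reading, so plain 'w' is used
f = codecs.open('foo.txt', 'w', 'utf-8')
f.write(c1)  # encoded on write as the two bytes 0xC2 0x85
f.close()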
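To check what such a program actually puts on disk, the file can be read back in binary mode. This is only a minimal sketch: it assumes io.open (available in Python 2.6 and later, and in Python 3) in place of codecs.open, a temporary file instead of foo.txt, and a hypothetical helper name utf8_bytes_on_disk.

```python
import io
import os
import tempfile


def utf8_bytes_on_disk(text):
    """Write unicode text as UTF-8 to a temporary file and return the raw bytes.

    Hypothetical helper for illustration; not part of the original program.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # io.open accepts unicode text and encodes it on write.
        # newline='' turns off any newline translation.
        with io.open(path, 'w', encoding='utf-8', newline='') as f:
            f.write(text)
        # Read the file back as bytes to see what was stored.
        with io.open(path, 'rb') as f:
            return f.read()
    finally:
        os.remove(path)


raw = utf8_bytes_on_disk(u'\x85')
print(repr(raw))
```

U+0085 is outside the ASCII range, so UTF-8 stores it as two bytes (0xC2 0x85) rather than as the single byte 0x85.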
But unichr(0x85) is a control character (U+0085, NEXT LINE); are you sure you want it?
What is the encoding of your data?
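The answer matters because the byte 0x85 means different things in different encodings. Here is a sketch, assuming the data arrives as a byte string and that its encoding is either latin-1 or cp1252. Both encodings are guesses for illustration, and reencode_to_utf8 and decodes_as_ascii are hypothetical helpers.

```python
def reencode_to_utf8(raw_bytes, source_encoding):
    """Decode bytes from a known source encoding, then encode them as UTF-8."""
    text = raw_bytes.decode(source_encoding)
    return text.encode('utf-8')


def decodes_as_ascii(raw_bytes):
    """Return True if the bytes are plain ASCII, False otherwise."""
    try:
        raw_bytes.decode('ascii')
        return True
    except UnicodeDecodeError:
        return False


data = b'\x85'

# latin-1 maps byte 0x85 to the control character U+0085.
as_latin1 = reencode_to_utf8(data, 'latin-1')

# cp1252 maps byte 0x85 to U+2026 HORIZONTAL ELLIPSIS.
as_cp1252 = reencode_to_utf8(data, 'cp1252')

print(repr(as_latin1))
print(repr(as_cp1252))
print(decodes_as_ascii(data))
```

In Python 2, passing a byte string that contains 0x85 to a codecs writer makes Python decode it implicitly as ascii first, and that decode fails with the error quoted above. Decoding explicitly with the real source encoding avoids the implicit step.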