q: how to output a unicode string?

Frank Stajano usenet423.4.fms at neverbox.com
Tue Apr 24 12:32:01 EDT 2007


A simple Unicode question: how do I print a unicode string?

Sample code:

# -*- coding: utf-8 -*-
s1 = u"héllô wórld"
print s1
# Gives UnicodeEncodeError: 'ascii' codec can't encode character
# u'\xe9' in position 1: ordinal not in range(128)
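
The error arises because print has to turn the unicode object into bytes, and when no output encoding is known Python 2 falls back to the ASCII codec. One possible way around it, sketched here under the assumption that whatever reads the output expects UTF-8, is to wrap the byte stream in a codec writer. The helper name utf8_writer and the in-memory buffer standing in for stdout are illustrative only and not part of the sample above.

```python
# -*- coding: utf-8 -*-
import codecs
import io


def utf8_writer(byte_stream):
    """Wrap a byte stream so unicode objects written to it are encoded as UTF-8.

    Hypothetical helper name, introduced only for this sketch.
    """
    return codecs.getwriter("utf-8")(byte_stream)


# Same text as s1 above, spelled with escapes so the source file encoding
# does not matter.
s1 = u"h\xe9ll\xf4 w\xf3rld"

# In-memory stand-in for stdout, so the bytes produced can be inspected.
buf = io.BytesIO()
out = utf8_writer(buf)
out.write(s1 + u"\n")   # the writer encodes; no implicit ASCII step

encoded = buf.getvalue()
```

Under Python 2, on a terminal that really is UTF-8 (an assumption), the same wrapper could be installed with sys.stdout = utf8_writer(sys.stdout), after which print s1 should go through the UTF-8 codec instead of ASCII.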


What I actually want to do is slightly more elaborate: read from a text 
file which is in utf-8, do some manipulations of the text and print the 
result on stdout. I understand I must open the file with

f = codecs.open("input.txt", "r", "utf-8")

but then I get stuck as above.
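
For reference, the read-and-manipulate half of the task can be sketched as follows. This is only an illustration: the temporary path, the sample contents and the upper() call are assumptions standing in for the real input file and the real manipulation. The point it shows is that codecs.open returns unicode objects, so the text stays unicode until it is explicitly encoded for output.

```python
# -*- coding: utf-8 -*-
import codecs
import os
import tempfile

# Hypothetical input path; a small UTF-8 file is created here so that the
# sketch is self-contained.
in_path = os.path.join(tempfile.gettempdir(), "input.txt")
raw = open(in_path, "wb")
raw.write(u"h\xe9ll\xf4 w\xf3rld\n".encode("utf-8"))
raw.close()

# Reading through codecs.open decodes the bytes to a unicode object.
f = codecs.open(in_path, "r", "utf-8")
text = f.read()
f.close()

# Example manipulation, done on unicode rather than on bytes.
result = text.upper()

# Encode once, at the very end, when the text leaves the program.
encoded = result.encode("utf-8")
```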

I tried

s2 = s1.encode("utf-8")
print s2

but got

héllô wórld

Then, in the hope of being able to write the string to a file if not to 
stdout, I also tried


import codecs
f = codecs.open("out.txt", "w", "utf-8")
f.write(s2)

but got

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: 
ordinal not in range(128)
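
This UnicodeDecodeError is consistent with the writer returned by codecs.open expecting a unicode object: handed the already-encoded byte string s2, Python 2 first tries to decode it with the ASCII codec and fails on byte 0xc3. A minimal sketch, assuming the goal is simply a UTF-8 file, passes the unicode string s1 instead. The output path in the temporary directory is hypothetical.

```python
# -*- coding: utf-8 -*-
import codecs
import os
import tempfile

# Same text as s1 above, spelled with escapes.
s1 = u"h\xe9ll\xf4 w\xf3rld"

# Hypothetical output location for this sketch.
out_path = os.path.join(tempfile.gettempdir(), "out.txt")

f = codecs.open(out_path, "w", "utf-8")
f.write(s1)   # unicode in; the writer does the UTF-8 encoding
f.close()

# Read the file back as raw bytes to check what was stored.
raw = open(out_path, "rb")
data = raw.read()
raw.close()
```

The alternative would be to keep s2 and write it to a file opened with the plain built-in open in binary mode, since s2 is already UTF-8 bytes.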

So I seem to be stuck.

I have checked several online python+unicode pages, including

http://boodebr.org/main/python/all-about-python-and-unicode#WHYNOPRINT
http://evanjones.ca/python-utf8.html
http://www.reportlab.com/i18n/python_unicode_tutorial.html
http://www.amk.ca/python/howto/unicode
http://www.example-code.com/python/python-charset.asp
http://docs.python.org/lib/csv-examples.html

but none of them was sufficient to make me understand how to deal with 
this simple problem. I'm sure it's easy, maybe too easy to be worth 
explaining in a tutorial...

Help gratefully received.


