Unicode list

Rehceb Rotkiv rehceb at no.spam.plz
Sat Mar 31 20:36:20 EDT 2007


Hello,

I have this little grep-like program:

++++++++++snip++++++++++
#!/usr/bin/python

import sys
import re

pattern = sys.argv[1]
inputfile = file(sys.argv[2], 'r')

for line in inputfile:
    matches = re.findall(pattern, line)
    if matches:
        print matches
++++++++++snip++++++++++

As written, the program prints some characters as strange escape
sequences, presumably because the input file is encoded in UTF-8. When I
convert the result of "re.findall(...)" to a string and wrap "unicode()"
around it, the matches are printed correctly. Is it possible to make
"matches" unicode without first saving it as a single string? The
function "unicode()" only seems to work on strings. Or is there a general
way of telling Python to abandon the ancient and evil land of ISO-8859
for good and use UTF-8 only?

Regards,
Rehceb
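
One possible approach, given as a minimal sketch rather than a definitive
answer: the escape sequences most likely come from printing the list
itself, because printing a list shows the repr() of each element, and in
Python 2 the repr() of a byte string escapes non-ASCII bytes. Decoding the
input to unicode while reading and printing the matches as joined text
avoids both the escapes and the detour through a single string. The sketch
assumes Python 2.6 or later (or Python 3) for io.open, an input file that
really is UTF-8, and a pattern with at most one capturing group. The
helper names find_matches and format_matches are hypothetical.

```python
import io
import re
import sys


def find_matches(pattern, lines):
    """Yield the non-empty list of matches for each line of text."""
    # re.UNICODE makes \w, \b etc. Unicode-aware on Python 2;
    # on Python 3 this is already the default for text patterns.
    regex = re.compile(pattern, re.UNICODE)
    for line in lines:
        matches = regex.findall(line)
        if matches:
            yield matches


def format_matches(matches):
    """Join the matches into one text string.

    Printing the joined text shows the characters themselves,
    whereas printing the list would show the repr() of each element.
    Assumes each match is a string (pattern has at most one group).
    """
    return u", ".join(matches)


def main(argv):
    pattern = argv[1]
    if isinstance(pattern, bytes):
        # Python 2: command-line arguments are byte strings.
        # Assumption: the terminal passes them as UTF-8.
        pattern = pattern.decode("utf-8")
    # io.open decodes each line to unicode text while reading.
    with io.open(argv[2], "r", encoding="utf-8") as inputfile:
        for matches in find_matches(pattern, inputfile):
            print(format_matches(matches))


if __name__ == "__main__":
    if len(sys.argv) == 3:
        main(sys.argv)
```

It would be invoked like the original program, with the pattern as the
first argument and the file name as the second. On Python 2, printing
unicode text can still fail if the output encoding cannot be detected,
for example when the output is piped.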


