Opening and reading randomly text file with umlauts and accents truncates output - unicode how?
synthespian
synthespian at uol.com.br
Thu Jan 31 23:11:16 EST 2002
Hi -
I have this text file with some German words, all of them nouns. The point
is displaying nouns in the singular form (first word after das, below),
and asking the user to input what the correct form of the definite
article is (der, die, or das - the list below is just a sample).
Sample:
das Appartement, Appartements
das Auge, Augen
das Bad, Bäder
das Bein, Beine
das Beispiel, Beispiele
das Buch, Bücher
das Büro, Büros
das Café, Cafés
So then I wrote this little script (included at the end of this message).
There may be a more elegant solution, but it does what I want it to do,
__except__ for the fact that when it comes to words with accents or
umlauts (Café, Büro), the output from
print m.group(2)
is truncated! Like "Caf" or "B".
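A minimal sketch of what is probably happening, written for Python 3 rather than the 1.5.2 used in this post (an assumption, made so the example runs on a current interpreter). When `\w` only recognises ASCII letters, the match for the noun stops at the first accented character. The `re.ASCII` flag is used here to imitate that behaviour; the helper name `noun_after_article` and the pattern names are hypothetical.

```python
import re

# The article and the noun are captured in two separate groups.
# re.ASCII restricts \w to [a-zA-Z0-9_], imitating an ASCII-only regex engine.
ASCII_PATTERN = re.compile(r'^(der|die|das)(\s\w+)', re.ASCII)
# In Python 3, str patterns are Unicode-aware by default, so \w also matches é and ü.
UNICODE_PATTERN = re.compile(r'^(der|die|das)(\s\w+)')


def noun_after_article(pattern, line):
    """Return the text captured after the article, or None if the line does not match."""
    match = pattern.search(line)
    return match.group(2) if match else None


cafe = 'das Caf\u00e9, Caf\u00e9s'   # "das Café, Cafés"
buero = 'das B\u00fcro, B\u00fcros'  # "das Büro, Büros"

for line in (cafe, buero):
    print(repr(noun_after_article(ASCII_PATTERN, line)),
          repr(noun_after_article(UNICODE_PATTERN, line)))
```

With the ASCII-only pattern the captured text ends just before the é or ü, which matches the truncation described above; with the Unicode-aware pattern the whole noun is captured.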
I'm using python 1.5.2 (Debian), but before you shout "Godammit you
acid-head, don't you know better than using 1.5.2?!" I'd like to know
precisely __how__ I am to use Unicode support in this code (yes, I
acknowledge I have to upgrade, I'll do it __tonight__, but please answer
the Unicode part, if you can).
TIA to all the fine people out there,
Spread the Love
synthespian at uol.com.br
#!/usr/bin/env python
import re
from random import randint
#filename = raw_input ('Enter file name: ')
#file = open(filename, 'r')
file = open('/home/xxxxxx/yyyy/shortwort.txt', 'r')
allLines = file.readlines()
file.close()
listSize = len(allLines)
listLine = randint(0, listSize-1)  # list indices run from 0 to listSize-1
p = re.compile('^(der|die|das(\s\w+))')
print listLine, "\n"
print allLines[listLine], "\n"
m = p.search(allLines[listLine])
print m.group(0),"m.group(0)\n"
print m.group(1),"m.group(1)\n"
print m.group(2),"m.group(2)\n" # This is word in singular, w/out definite article
print m.group(1)[0:3], "\n" # Not elegant, but does the trick: print only the article
answerString = m.group(1)[0:3]
print answerString, "This is the answer string\n"
print 'What is the definite article related to: ', m.group(2), '?\n'
antwort = raw_input('Answer: ')
if antwort == answerString:
    print "Richtig!"
else:
    print "You're soooo wrong, dude!"
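For comparison, here is a hedged sketch of the same quiz in Python 3, not the original 1.5.2 script. It assumes the word list can be decoded with a known encoding: the UTF-8 default is a guess, and a file written in 2002 may well be Latin-1, in which case `encoding='latin-1'` can be passed instead. The function names (`parse_entry`, `load_entries`, `check_answer`, `quiz`) are hypothetical. In Python 2.x the analogous steps would be decoding each line to a unicode object and compiling the pattern with `re.UNICODE`.

```python
import random
import re

# Article and noun are separate groups, so no slicing is needed to recover the article.
ENTRY = re.compile(r'^(der|die|das)\s+(\w+)')


def parse_entry(line):
    """Split 'das Buch, Bücher' into ('das', 'Buch'); return None if the line does not match."""
    match = ENTRY.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


def load_entries(path, encoding='utf-8'):
    """Read the word list, decoding it explicitly (the encoding is an assumption)."""
    with open(path, encoding=encoding) as handle:
        parsed = (parse_entry(line) for line in handle)
        return [entry for entry in parsed if entry is not None]


def check_answer(entry, answer):
    """Compare the user's answer with the article stored in the entry."""
    article, _noun = entry
    return answer.strip().lower() == article


def quiz(path):
    """Pick a random entry from the file at `path` and ask for its article."""
    article, noun = random.choice(load_entries(path))
    answer = input('What is the definite article related to: %s? ' % noun)
    if check_answer((article, noun), answer):
        print('Richtig!')
    else:
        print("You're soooo wrong, dude! It is %s." % article)
```

Calling `quiz` with the path to the word list runs one round. `random.choice` picks from the whole list, so the first line is not skipped.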