Python nuube needs Unicode help

gheissenberger at gmail.com
Thu Jan 11 18:42:07 EST 2007


  Progress! You managed to change the error message.

  File "./acc_test_script_generator.py", line 106, in loadData
    print u.encode('utf-8')
AttributeError: Utterance instance has no attribute 'encode'

I'm missing something really obvious here, but I don't know what it
is...
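
The traceback suggests that u is an instance of a class called Utterance
rather than a string, so it has no encode method of its own. Below is a
minimal sketch of that situation. The Utterance class and its raw_text
attribute are hypothetical stand-ins, since the real class is not shown in
this thread, and the code uses Python 3 syntax, where str plays the role of
Python 2's unicode type.

```python
class Utterance(object):
    """Hypothetical stand-in for the script's Utterance class."""

    def __init__(self, raw_text):
        # raw_text is an assumed attribute name, for illustration only.
        self.raw_text = raw_text


def encode_utterance(utt, encoding="utf-8"):
    """Encode the text held by the object, not the object itself."""
    return utt.raw_text.encode(encoding)


u = Utterance(u"n\xe3o")

# The instance itself has no encode method, which matches the traceback.
has_encode = hasattr(u, "encode")

# Encoding the text attribute yields a byte string that can be written out.
encoded = encode_utterance(u)
```

If the real class keeps its text under a different attribute, or builds it
in a string-conversion method, that text would be the thing to encode.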


Diez B. Roggisch wrote:
> gheissenberger at gmail.com schrieb:
> > HELP!
> > Guy who was here before me wrote a script to parse files in Python.
> >
> > Includes line:
> > print u
> > where u is a line from a file we are parsing.
> > However, we have started receiving data from Brazil. If I open file to
> > parse in VI, looks like:
> >
> > <Utt id="3" transcribe="yes" audioRoot="A1"
> > audio="313-20070102144528.wav" grammarSet="G3" rawText="não"
> > recValue="{data:CHOICE=NO;}" conf="970" rawText2="" conf2="0"
> > transcribedText="não" parsableText="não"/
> >
> > Clearly those "nã" are some non-ASCII characters, but how do I get
> > print to understand that?
> >
> > I keep getting:
> > "UnicodeEncodeError: 'ascii' codec can't encode character u'\xe3' in
> > position 40:
> >  ordinal not in range(128)"
> >
>
> Does the error happen at the
>
> print u
>
> line? If yes, what happens is that you try and print a unicode object.
> Which means that it has to be converted (actually the right term is
> encoded) to a byte-string. If you don't do that explicitly, it will be
> done implicitly, using the default encoding - which is ascii.
>
> If you have non-ascii characters, you end up with the error you see.
>
> What to do? Use something like this:
> 
> print u.encode('utf-8')
> 
> instead.
> 
> Diez
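
The behaviour Diez describes can be tried in isolation. The sketch below
assumes a text value containing the same character as the error message
(u'\xe3') and uses Python 3 syntax, where str corresponds to Python 2's
unicode type. It only illustrates the difference between the two codecs and
is not taken from the original script.

```python
text = u"n\xe3o"  # "não"; \xe3 is the character named in the error

# Encoding with ASCII fails because \xe3 is outside range(128).
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding explicitly with UTF-8 succeeds and yields a byte string.
utf8_bytes = text.encode("utf-8")
```

UTF-8 needs two bytes for this character, so the encoded result is one byte
longer than the three-character text.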



