problems writing utf8

Boudewijn Rempt boud at valdyas.org
Fri Apr 12 16:41:27 EDT 2002


It started when I wanted to write UTF-8 data to standard output.
I've got a UTF-8-enabled Konsole, so I didn't want any of the
default escapes that print uses.

However, that didn't work -- print gave errors because some of my
data goes way beyond ASCII ordinal 128.
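One way to avoid relying on the default conversion is to wrap the output stream in an explicit UTF-8 writer. The following is only a minimal sketch, written against a current Python 3 rather than the 2.2.1 interpreter used in this post. The helper name `utf8_writer` is hypothetical, and U+0294 is an assumed stand-in for the glottal stop character. It writes to an in-memory buffer so the resulting bytes can be inspected.

```python
import codecs
import io


def utf8_writer(byte_stream):
    """Wrap a byte stream so that text written to it is encoded as UTF-8."""
    return codecs.getwriter("utf-8")(byte_stream)


# Demonstrate on an in-memory buffer so the encoded bytes can be inspected.
buf = io.BytesIO()
out = utf8_writer(buf)
out.write(u"\u0294\n")  # U+0294 is an assumed example character
print(repr(buf.getvalue()))

# For a real terminal under Python 3, the byte-level stream would be
# sys.stdout.buffer, e.g. utf8_writer(sys.stdout.buffer).
```

The point of the wrapper is that the encoding step is explicit, so nothing depends on what the interpreter assumes about the terminal.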

Then I tried writing the UTF-8 data to a file instead. I tried two
methods of constructing that file:

    import codecs
    f = open("syllables", "w+")
    f2 = codecs.EncodedFile(f, "unicode_internal", "utf-8")
    f2.write(u"a")
    f2.close()

This segfaults when Python exits:

Python 2.2.1 (#1, Apr 12 2002, 22:26:33)
[GCC 2.95.3 20010315 (SuSE)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import codecs
>>> f = open("syllables", "w+")
>>> f2 = codecs.EncodedFile(f, "unicode_internal", "utf-8"
... )
>>> f2.write(u"a")
>>> f2.close()
>>>
Segmentation fault
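For reference, `codecs.EncodedFile(file, data_encoding, file_encoding)` is a transcoding wrapper: data written to it is decoded from `data_encoding` and stored in the underlying file as `file_encoding`. The sketch below illustrates that behaviour under a current Python 3, with `latin-1` substituted as the data encoding (an assumption for illustration, since `unicode_internal` is not available there). It does not reproduce or explain the segfault.

```python
import codecs
import io

# The underlying "file" is an in-memory byte buffer for easy inspection.
raw = io.BytesIO()

# Data handed to the wrapper is treated as Latin-1 and stored as UTF-8.
recoder = codecs.EncodedFile(raw, "latin-1", "utf-8")
recoder.write(b"\xe9")  # Latin-1 byte for e-acute (U+00E9)

stored = raw.getvalue()
print(repr(stored))
```

The wrapper expects already-encoded data on the caller's side, which is a different contract from a writer that accepts Unicode text directly.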

The second method writes garbage-encoded data to the file:

    f3 = codecs.open("syllables2", "w+", "utf-8")
    f3.write(u"?")
    f3.close()

(Where u"?" contains any Unicode character you like -- in this
case a glottal stop.)
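A quick way to tell whether the file really contains garbage, or valid UTF-8 that is merely being displayed by a tool that does not expect UTF-8, is to read the file back in binary mode and look at the raw bytes. This is a sketch under assumptions: it targets a current Python 3, the helper name `write_and_dump` is hypothetical, and U+0294 is again an assumed stand-in for the glottal stop.

```python
import codecs
import os
import tempfile


def write_and_dump(text, encoding="utf-8"):
    """Write text through codecs.open and return the raw bytes on disk."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        f = codecs.open(path, "w", encoding)
        try:
            f.write(text)
        finally:
            f.close()
        # Read back without any decoding to see exactly what was stored.
        with open(path, "rb") as raw:
            return raw.read()
    finally:
        os.remove(path)


data = write_and_dump(u"\u0294")  # assumed example character
print(repr(data))
# Compare against the encoding computed directly from the string.
print(data == u"\u0294".encode("utf-8"))
```

If the two byte sequences match, the file is correctly encoded and the problem lies in whatever is used to view it.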

I'm sure I must be doing something very silly in the second case.
As for the first, I rather thought Python didn't segfault...

-- 
Boudewijn Rempt | http://www.valdyas.org


