[ python-Bugs-1331062 ] utf 7 codec broken

SourceForge.net noreply at sourceforge.net
Wed Oct 19 10:23:24 CEST 2005


Bugs item #1331062, was opened at 2005-10-19 08:23
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1331062&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Unicode
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Ralf Schmitt (titty)
Assigned to: M.-A. Lemburg (lemburg)
Summary: utf 7 codec broken

Initial Comment:
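The following code doesn't work as expected. Decoding a byte string that
contains the non-ASCII byte 0xe8 with the utf7 codec raises no error, and the
resulting unicode object compares unequal to the equivalent literal and
encodes to invalid UTF-8: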

ralf@stronzo:~$ cat t.py
#! /usr/bin/env python

s = 'Auguste and Louis Lumi\xe8re'
print repr(s)
u1 = s.decode('utf7')
print 'from utf7: %d %r' % (len(u1), u1)
u2 = u'Auguste and Louis Lumi\xe8re'
print '       u2: %d %r' % (len(u2), u2)

print 'u1==u2', u1==u2

e1 = u1.encode('utf8')
e2 = u2.encode('utf8')

print 'e1=%r' % e1
print 'e2=%r' % e2

unicode(e2, 'utf8')
unicode(e1, 'utf8')
ralf@stronzo:~$ python t.py
'Auguste and Louis Lumi\xe8re'
from utf7: 25 u'Auguste and Louis Lumi\xe8re'
       u2: 25 u'Auguste and Louis Lumi\xe8re'
u1==u2 False
e1='Auguste and Louis Lumi\xff\xbf\xbf\xa8re'
e2='Auguste and Louis Lumi\xc3\xa8re'
Traceback (most recent call last):
  File "t.py", line 19, in ?
    unicode(e1, 'utf8')
  File "/usr/local/lib/python2.4/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xff in position 22: unexpected code byte
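
For comparison, here is a minimal sketch of the behaviour one would probably expect. It is written for Python 3, not the Python 2.4 used in the report above. UTF-7 is a 7-bit encoding, so the byte 0xe8 is not valid input, and a current CPython rejects it with a UnicodeDecodeError. The helper name try_utf7 and the variable names are illustrative only and do not come from the report.

```python
# Sketch for Python 3; the original report uses Python 2.4 syntax.
# try_utf7 is a hypothetical helper, not part of the codecs API.

def try_utf7(data):
    """Return the decoded text, or None if the utf-7 codec rejects the input."""
    try:
        return data.decode('utf-7')
    except UnicodeDecodeError:
        return None

# The byte string from the report: 0xe8 is outside the 7-bit range.
raw = b'Auguste and Louis Lumi\xe8re'
rejected = try_utf7(raw)

# The same text in well-formed UTF-7: U+00E8 is written as "+AOg-".
well_formed = try_utf7(b'Auguste and Louis Lumi+AOg-re')

print(repr(rejected))
print(repr(well_formed))
```

With a decoder that rejects the stray byte, the problem shows up at the decode step and no malformed unicode object reaches the later utf8 encode.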


----------------------------------------------------------------------
