[ python-Bugs-909230 ] bug in idna-encoding-module

Wed Mar 24 12:00:55 EST 2004

Bugs item #909230, was opened at 2004-03-03 19:13
Message generated for change (Settings changed) made by loewis
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=909230&group_id=5470

Category: Python Library
Group: Python 2.3
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Rumpeltux (rumpeltux)
Assigned to: Martin v. Löwis (loewis)
Summary: bug in idna-encoding-module

Initial Comment:
in /usr/lib/python2.3/encodings/idna.py, line 175 it goes:
labels = input.split('.')
which causes the interpreter to stop executing the
program, but by changing it to
labels = dots.split(input)
everything's fine ;)

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2004-03-24 18:00

Message:
Logged In: YES 
user_id=21627

Found it. PyUnicode_FromEncodedObject converts the string object to 
char*/len, then calls PyUnicode_Decode. That function special-cases UTF-8, 
Latin-1 and ASCII; for any other encoding it creates a buffer object and 
passes it to PyCodec_Decode.

Even if it might be possible to pass the string directly to the codec, the 
codec still has to deal with buffer objects, for direct callers of 
PyUnicode_Decode. So I am leaving the fix as-is; I have added a test case 
(test_codecs.py 1.10) and am closing this as fixed.
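
A regression check along these lines might look like the sketch below. It is
written for Python 3 and is not the test that went into test_codecs.py 1.10:
decode_idna is a hypothetical helper, memoryview stands in for the Python 2
buffer object, and it assumes the built-in idna codec accepts any bytes-like
input.

```python
import codecs


def decode_idna(data):
    """Decode an IDNA-encoded name given as a bytes-like object."""
    return codecs.decode(data, "idna")


name = b"xn--mller-kva.de"

# Plain bytes and a buffer-like view of the same data should decode alike.
from_bytes = decode_idna(name)
from_view = decode_idna(memoryview(name))
```

The point of decoding through a view as well as through plain bytes is that
direct callers may hand the codec something that is not a string object.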

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2004-03-24 17:40

Message:
Logged In: YES 
user_id=21627

This is now fixed in idna.py 1.4 and 1.2.12.2, by converting input to a 
string object. I leave this open to find out why there is a buffer object in 
the first place.
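
A minimal sketch of this kind of fix, written for Python 3 rather than the
Python 2.3 code under discussion. It is not the committed patch: split_labels
is a hypothetical helper, memoryview stands in for the Python 2 buffer object,
and the dots pattern is an assumption based on the four label separators
listed in RFC 3490.

```python
import re

# Assumed separator pattern: the four "dot" characters from RFC 3490.
dots = re.compile("[\u002E\u3002\uFF0E\uFF61]")


def split_labels(data):
    """Coerce bytes-like input to text, then split it into labels."""
    if not isinstance(data, str):
        # A buffer-like object has no split(); copy it into a real
        # bytes object and decode it as ASCII first.
        data = bytes(data).decode("ascii")
    return dots.split(data)


labels = split_labels(memoryview(b"xn--mller-kva.de"))
```

Converting up front keeps the rest of the function free of type checks, at
the cost of one copy of the input.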

----------------------------------------------------------------------

Comment By: Rumpeltux (rumpeltux)
Date: 2004-03-23 17:17

Message:
Logged In: YES 
user_id=989758

>>> unicode('xn--mller-kva.de', 'idna')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.3/encodings/idna.py", line 175, in decode
    labels = input.split(".")
AttributeError: 'buffer' object has no attribute 'split'

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2004-03-22 23:26

Message:
Logged In: YES 
user_id=21627

I can't see any problem in the code. The invocation of unicode() is 
correct - we just look for the exception that the call may raise.

Rumpeltux, can you please report the exact input and exception you get?

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2004-03-05 20:52

Message:
Logged In: YES 
user_id=33168

Martin, it looks like line 174: unicode(input, "ascii")
should be input = unicode(input, "ascii").
I'm not sure what's supposed to be happening here, but it
looks like the if/else code block may be able to be
rewritten as:

if not isinstance(input, unicode):
    input = unicode(input, "ascii")
labels = dots.split(input)

----------------------------------------------------------------------
