[Patches] [ python-Patches-1453235 ] IDNA codec simplification

SourceForge.net noreply at sourceforge.net
Mon Mar 20 22:25:40 CET 2006


Patches item #1453235, was opened at 2006-03-18 16:52
Message generated for change (Settings changed) made by loewis
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1453235&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Library (Lib)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Walter Dörwald (doerwalter)
>Assigned to: Walter Dörwald (doerwalter)
Summary: IDNA codec simplification

Initial Comment:
This patch simplifies the idna codec. It moves the
encode and decode functionality out of the Codec class,
so that it can be reused by the stateless and the
incremental codecs. (See patch #1436130 for the history
of this patch).

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2006-03-19 11:52

Message:
Logged In: YES 
user_id=89016

You're right, this patch doesn't change the codec's behaviour
at all (the StreamWriter has the same problem:
import sys, codecs

w = codecs.getwriter("idna")(sys.stdout)
w.write(u"dör")
w.write(u"wald.de")
)

I'll try to come up with a patch that implements a real
incremental encoder and decoder.
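
A minimal sketch of what such an incremental encoder might look like,
written for Python 3 rather than the Python 2 of this thread. It is not
the patch discussed here: the class name LabelBufferedEncoder and its
"final" flag are hypothetical. It assumes that only the ASCII dot
separates labels (IDNA also recognises other dot characters), and it
simply holds back the trailing, possibly incomplete label until a dot
or the final flush arrives.

```python
class LabelBufferedEncoder:
    """Hypothetical helper: encodes only labels known to be complete."""

    def __init__(self):
        # Text of the trailing label that may still grow with the next chunk.
        self._pending = ""

    def encode(self, chunk, final=False):
        data = self._pending + chunk
        if final:
            # No more input will arrive, so everything buffered is complete.
            self._pending = ""
            return data.encode("idna") if data else b""
        head, dot, tail = data.rpartition(".")
        if not dot:
            # No separator seen yet: the whole input is one unfinished label.
            self._pending = data
            return b""
        # Everything before the last dot consists of complete labels.
        self._pending = tail
        encoded = [label.encode("idna") for label in head.split(".")]
        return b".".join(encoded) + b"."


enc = LabelBufferedEncoder()
result = enc.encode("dör") + enc.encode("wald.de") + enc.encode("", final=True)
print(result)
```

With the chunks from the StreamWriter example above, the first call
returns nothing, because "dör" might still be extended by later input.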


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2006-03-19 11:13

Message:
Logged In: YES 
user_id=21627

This patch is wrong (AFAICT). If I understand the
incremental API correctly, it should be possible to pass
chunks of the input to the incremental encoder;
concatenating the results should give the same string as if
I had passed the entire input at once. This doesn't work for
this patch:

py> u"dörwald.de".encode("idna")
'xn--drwald-wxa.de'
py> c = codecs.getincrementalencoder("idna")()
py> c.encode(u"dör")
'xn--dr-fka'
py> c.encode(u"wald.de")
'wald.de'

So in the first case, I get the (correct) result
'xn--drwald-wxa.de'; in the second case, I get the incorrect
result 'xn--dr-fkawald.de'.
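
The property described above (concatenated chunk results must equal the
one-shot result) can be checked with a small helper. This is only a
sketch for Python 3; the function names encode_in_chunks and
chunking_is_transparent are made up for illustration, and it exercises
whatever idna codec the running interpreter ships, not the patch under
review.

```python
import codecs


def encode_in_chunks(chunks, encoding="idna"):
    """Feed the chunks one by one to an incremental encoder and join the output."""
    encoder = codecs.getincrementalencoder(encoding)()
    parts = [encoder.encode(chunk) for chunk in chunks]
    # An empty final call lets the encoder flush anything it buffered.
    parts.append(encoder.encode("", final=True))
    return b"".join(parts)


def chunking_is_transparent(chunks, encoding="idna"):
    """Return True if chunked encoding matches encoding the whole input at once."""
    whole = "".join(chunks).encode(encoding)
    return encode_in_chunks(chunks, encoding) == whole


print(chunking_is_transparent(["dör", "wald.de"]))
```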

To properly encode IDNA incrementally, you need to process
an entire label at a time (i.e. between two dots, 

----------------------------------------------------------------------
