[Python-Dev] urllib unicode handling

Robert Brewer fumanchu at aminus.org
Wed May 7 17:55:34 CEST 2008


"Martin v. Löwis" wrote:
> The proper way to implement this would be IRIs (RFC 3987),
> in particular section 3.1. This is not as simple as just
> encoding it as UTF-8, as you might have to apply IDNA to
> the host part.
> 
> Code doing so just hasn't been contributed yet.

But if someone wanted to do so, the IDNA step for the host part is pretty simple:

>>> u'www.\u212bngstr\xf6m.com'.encode("idna")
'www.xn--ngstrm-hua5l.com'
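
For anyone who wants to take this further, the following is a rough sketch of the mapping described in RFC 3987 section 3.1, written for Python 3 rather than the Python 2 session above. The function name iri_to_uri and the chosen sets of "safe" characters are my own assumptions, not an existing urllib API. It applies IDNA to the host and percent-encodes the UTF-8 bytes of everything else, and it deliberately ignores userinfo, IPv6 literals, and the finer points of the RFC.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched when percent-encoding (an assumption for this
# sketch: RFC 3986 sub-delims plus ":", "@", "/" and an existing "%").
_SAFE_PATH = "/%:@!$&'()*+,;=~"
_SAFE_QUERY = "/?%:@!$&'()*+,;=~"


def iri_to_uri(iri):
    """Convert an IRI string to an ASCII-only URI string (hypothetical helper).

    Host: encoded with the built-in "idna" codec.
    Path, query, fragment: non-ASCII characters are encoded as UTF-8 and
    percent-escaped. Userinfo is dropped to keep the sketch short.
    """
    parts = urlsplit(iri)

    # urlsplit() lowercases the hostname; IDNA is applied label by label.
    host = parts.hostname.encode("idna").decode("ascii") if parts.hostname else ""
    netloc = host
    if parts.port is not None:
        netloc += ":%d" % parts.port

    # quote() uses UTF-8 by default for non-ASCII characters.
    path = quote(parts.path, safe=_SAFE_PATH)
    query = quote(parts.query, safe=_SAFE_QUERY)
    fragment = quote(parts.fragment, safe=_SAFE_QUERY)

    return urlunsplit((parts.scheme, netloc, path, query, fragment))


if __name__ == "__main__":
    print(iri_to_uri("http://www.\u212bngstr\xf6m.com/caf\xe9?q=\xfc"))
```

The host in the result should match the encoded name shown in the session above; the path and query come out percent-encoded. Note that the "idna" codec implements the older IDNA 2003 rules, so a real implementation might need a different library for newer registrations.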


Robert Brewer
fumanchu at aminus.org


