raise UnicodeError, "label too long"

Flavio fccoelho at gmail.com
Wed Jan 24 19:25:19 EST 2007


Something like this, for instance:
http://.wikipedia.org/wiki/Copper%28II%29_hydroxide

But even a URL without any non-ASCII characters, such as this one

http://.wikipedia.org/wiki/Ammonia

also fails when passed to urlopen:
File "/usr/lib/python2.4/encodings/idna.py", line 72, in ToASCII
    raise UnicodeError, "label too long"
UnicodeError: label too long

This is very strange, because I tried other unicode URLs from the Python
console, like this

urllib2.urlopen(u'www.google.com')

and it works normally.
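
One possible explanation, offered as a guess rather than a diagnosis: both failing URLs above have a host that starts with a dot, so the first label of the host name is empty, and the idna codec only accepts labels of 1 to 63 characters. The sketch below checks that idea. It is written for Python 3 (the traceback above is from Python 2.4), and the helper names host_of and idna_accepts are made up for illustration.

```python
# Python 3 sketch; the traceback in this message comes from Python 2.4.
from urllib.parse import urlsplit


def host_of(url):
    """Return the host part of a URL, or an empty string if there is none."""
    return urlsplit(url).hostname or ""


def idna_accepts(host):
    """Return True if the standard 'idna' codec can encode the host name."""
    try:
        host.encode("idna")
    except UnicodeError:
        # Raised when a label is empty or longer than 63 characters,
        # among other encoding problems.
        return False
    return True


for url in (
    "http://.wikipedia.org/wiki/Ammonia",  # URL from this message
    "http://www.google.com/",              # host from this message, scheme added
):
    print(url, idna_accepts(host_of(url)))
```

If the leading dot is the cause, the first URL is reported as rejected and the second as accepted, which would match the behaviour described above.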





Martin v. Löwis wrote:
> Flavio wrote:
> > What I am doing is very simple:
> >
> > I fetch an url (html page) parse it using BeautifulSoup, extract the
> > links and try to open each of the links, repeating the cycle.
> >
> > Beautiful soup converts the html to unicode. That's why when I try to
> > open the links extracted from the page I get this error.
> >
> > This is bad, since some links do contain strings with non-ascii
> > characters.
>
> Please try answering the exact question that Marc asked:
> what is an example for unicode string that triggers the
> exception?
> 
> Regards,
> Martin
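
For the crawl described in the quoted message, where extracted links arrive as unicode and may contain non-ASCII characters, one common approach is to turn each link into a plain ASCII URL before opening it: the host goes through the idna codec and the rest is percent-encoded as UTF-8. This is only a minimal Python 3 sketch under those assumptions. The function name to_ascii_url is hypothetical, user names and passwords in the URL are ignored, and a host with an empty label still raises UnicodeError.

```python
# Python 3 sketch: convert a unicode URL to an ASCII-only URL.
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url):
    """Return an ASCII-only version of url (hypothetical helper)."""
    parts = urlsplit(url)

    # Host: IDNA-encode; raises UnicodeError for empty or overlong labels.
    host = parts.hostname or ""
    netloc = host.encode("idna").decode("ascii")
    if parts.port is not None:
        netloc += ":%d" % parts.port

    # Path, query and fragment: percent-encode non-ASCII characters as UTF-8.
    # "%" is kept safe so existing escapes such as %28 are not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&+%")
    fragment = quote(parts.fragment, safe="%")

    return urlunsplit((parts.scheme, netloc, path, query, fragment))


# The path below is a made-up example containing a non-ASCII character.
print(to_ascii_url("http://www.google.com/wiki/L\u00f6wis"))
```

Catching UnicodeError around the call would let the crawl skip a malformed link instead of stopping on it.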



