raise UnicodeError, "label too long"
Flavio
fccoelho at gmail.com
Wed Jan 24 19:25:19 EST 2007
something like this, for instance:
http://.wikipedia.org/wiki/Copper%28II%29_hydroxide
but even a URL without any non-ASCII characters, such as this one
http://.wikipedia.org/wiki/Ammonia
also fails when passed to urlopen:
File "/usr/lib/python2.4/encodings/idna.py", line 72, in ToASCII
raise UnicodeError, "label too long"
UnicodeError: label too long
Very strange, because I tried other unicode URLs from the Python
console, like this:
urllib2.urlopen(u'www.google.com')
and it works normally.
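
One thing that might help narrow this down is looking at the host part of each URL on its own. The sketch below is only a rough check, written for Python 3 (urllib.parse) rather than the Python 2.4 shown in the traceback, and the helper name bad_host_labels is hypothetical. It assumes the IDNA step rejects any host label that is empty or longer than 63 characters, which would include a host that begins with a dot.

```python
from urllib.parse import urlsplit


def bad_host_labels(url):
    """Return the host labels that are empty or longer than 63 characters."""
    host = urlsplit(url).hostname or ""
    # A single trailing dot (fully qualified name) is legal, so ignore it.
    if host.endswith("."):
        host = host[:-1]
    # Assumption: a valid label has between 1 and 63 characters.
    return [label for label in host.split(".") if not 0 < len(label) < 64]


# Compare one of the failing URLs above with one that opens normally.
for url in ("http://.wikipedia.org/wiki/Ammonia",
            "http://www.google.com/"):
    print(url, bad_host_labels(url))
```

If the list printed for a URL is not empty, the host name itself is a likely suspect, independent of any non-ASCII characters in the path.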
Martin v. Löwis wrote:
> Flavio wrote:
> > What I am doing is very simple:
> >
> > I fetch an url (html page) parse it using BeautifulSoup, extract the
> > links and try to open each of the links, repeating the cycle.
> >
> > Beautiful soup converts the html to unicode. That's why when I try to
> > open the links extracted from the page I get this error.
> >
> > This is bad, since some links do contain strings with non-ascii
> > characters.
>
> Please try answering the exact question that Marc asked:
> what is an example for unicode string that triggers the
> exception?
>
> Regards,
> Martin