Ann: Validating Emails and HTTP URLs in Python

andrew cooke andrew at acooke.org
Mon May 3 10:13:58 EDT 2010


> FYI, Fourthought's PyXML has a module called uri.py that contains  
> regexes for URL validation. I've run over a million URLs (harvested from
> the Internet) through their code. I can't say I checked each and every  
> result, but I never saw anything that would lead me to believe it was  
> misbehaving.
>
> It might be interesting to compare the results of running a large list  
> of URLs through your code and theirs.
>
> Good luck
> Philip

It's getting a set of URLs that's the main problem.  I've tested it
with URL examples in RFC 3696, and with a few extra ones that test
particular issues, but when I looked around I couldn't find any
public, obvious list of URLs for general testing.  Could I use your
list?

Also, same for emails...
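
For what it's worth, the comparison itself would only need a small
harness along these lines. This is a rough sketch, not code from either
library: compare_validators, loose_http_check and strict_http_check are
hypothetical names, and the two checks are simple stand-ins for the real
validators (one built on the standard library's urlsplit, one on a
deliberately narrow regex). The sample list is made up for illustration.

```python
import re
from urllib.parse import urlsplit


def compare_validators(candidates, validate_a, validate_b):
    """Return (candidate, result_a, result_b) for every disagreement."""
    disagreements = []
    for candidate in candidates:
        result_a = bool(validate_a(candidate))
        result_b = bool(validate_b(candidate))
        if result_a != result_b:
            disagreements.append((candidate, result_a, result_b))
    return disagreements


# Hypothetical stand-in: accepts anything with an http(s) scheme and a host.
def loose_http_check(url):
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Hypothetical stand-in: a deliberately narrow regex, not a full RFC grammar.
_STRICT_HTTP = re.compile(r"^https?://[A-Za-z0-9.-]+(:\d+)?(/\S*)?$")


def strict_http_check(url):
    return _STRICT_HTTP.match(url) is not None


# Made-up sample; a real run would read a large harvested list instead.
sample = [
    "http://www.python.org",
    "http://example.com/path?x=1",
    "ftp://example.com/",
    "http://exa mple.com/",
    "not a url",
]

for url, loose, strict in compare_validators(
        sample, loose_http_check, strict_http_check):
    print(url, "loose:", loose, "strict:", strict)
```

Swapping in the real validators should only mean passing different
callables, and the same harness would work for a list of email
addresses. Only the disagreements need checking by hand, which is much
less work than reviewing every result.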

Cheers,
Andrew

