Python program that validates a URL against the W3C markup validator

Fredrik Lundh fredrik at pythonware.com
Wed Nov 29 01:58:50 EST 2006


yaru22 wrote:

> I'd like to create a program that validates a bunch of urls against the
> w3c markup validator (http://validator.w3.org/) and store the result in
> a file.
> 
> Since I don't know network programming, I have no idea how to start
> coding this program.
> 
> I was looking at the python library and thought urllib or urllib2 may
> be used to make this program work.
> 
> But I don't know how to send my urls to the w3c validator and get the
> result.

this should get you going, I think:

 >>> import urllib
 >>> uri = "http://www.python.org"
 >>> f = urllib.urlopen("http://validator.w3.org/check?uri=" + uri)
 >>> print f.headers
Date: Wed, 29 Nov 2006 06:52:33 GMT
Server: Apache/2.0.54 (Debian GNU/Linux) mod_perl/1.999.21 Perl/v5.8.4
Content-Language: en
X-W3C-Validator-Recursion: 1
X-W3C-Validator-Status: Valid
X-W3C-Validator-Errors: 0
Connection: close
Content-Type: text/html; charset=utf-8
 >>> print f.headers["x-w3c-validator-status"]
Valid

 >>> uri = "http://www.cnn.com"
 >>> f = urllib.urlopen("http://validator.w3.org/check?uri=" + uri)
 >>> print f.headers["x-w3c-validator-status"]
Invalid
 >>> print f.headers["x-w3c-validator-errors"]
39
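
to run this over a whole list of urls and store the results in a file, something along these lines might work. it's only a sketch: it assumes Python 3 (where urlopen lives in urllib.request rather than urllib), and that the validator still reports its verdict in the X-W3C-Validator-* headers shown above. build_check_url, check and validate_all are made-up names for illustration, not part of any library:

```python
import urllib.parse
import urllib.request

# Endpoint taken from the session above; assumed to still be available.
VALIDATOR = "http://validator.w3.org/check"


def build_check_url(uri):
    # Percent-encode the target so that urls containing "?" or "&"
    # survive being passed as a query parameter.
    return VALIDATOR + "?" + urllib.parse.urlencode({"uri": uri})


def check(uri, opener=urllib.request.urlopen):
    # Returns (status, errors) read from the response headers;
    # either value is None if the header is missing.
    response = opener(build_check_url(uri))
    try:
        headers = response.headers
        return (headers.get("X-W3C-Validator-Status"),
                headers.get("X-W3C-Validator-Errors"))
    finally:
        response.close()


def validate_all(uris, out, opener=urllib.request.urlopen):
    # Writes one tab-separated line per url: uri, status, error count.
    for uri in uris:
        try:
            status, errors = check(uri, opener)
        except OSError as exc:  # urllib.error.URLError is an OSError
            status, errors = "Error: %s" % exc, None
        out.write("%s\t%s\t%s\n" % (uri, status, errors))
```

pass validate_all a list of urls and a file opened for writing, and you get one line per url. the opener argument is only there so the network call can be swapped out when testing. if the list is long, a short pause between requests is probably the polite thing to do.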

</F>



