Reading only headers

Sun Apr 12 16:42:26 EDT 2009

On Thu, 09 Apr 2009 07:13:08 -0300, S.Selvam <s.selvamsiva at gmail.com>
wrote:

> I want to read the headers from a web page and check whether its
> Content-Type is XML or not. I used the following code:
>  ...
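>  # USER_AGENT is assumed to be a dict of request headers
>  request = urllib2.Request(url, None, USER_AGENT)
>  opener = urllib2.build_opener()
>  datastream = opener.open(request)
>  if 'xml' not in datastream.headers.get('Content-Type', ''):
>      # String exceptions are deprecated; raise a real exception instead
>      raise ValueError("It's not xml!")
>  else:
>      """
>      Read the content and process
>      """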
> ...
>
> Is this a good way to read the headers? I do not want the content
> unless it is XML.
> Please suggest any other good methods for reading only the headers.

The best way is to issue an HTTP HEAD request, so that the server sends
only the headers and no body. Search this newsgroup for some ways of
doing that. OK, see these posts:

http://groups.google.com/group/comp.lang.python/t/b1060b62b2f12a04/
http://groups.google.com/group/comp.lang.python/t/bbac82df3d64d48e/
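
For readers who cannot reach those threads, here is a minimal sketch of
the HEAD approach. It is not taken from the linked posts. It assumes
Python 3, where urllib.request replaces the urllib2 module used in the
quoted code. The function names, the default User-Agent string and the
timeout are made up for illustration. The Content-Type test is also a
little stricter than the substring search in the question.

```python
import urllib.request


def is_xml_content_type(content_type):
    """Return True if a Content-Type header value names an XML media type."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return (media_type in ("text/xml", "application/xml")
            or media_type.endswith("+xml"))


def head_content_type(url, user_agent="example-agent/0.1", timeout=10):
    """Send a HEAD request and return the Content-Type header ('' if absent).

    The user_agent and timeout defaults are illustrative assumptions.
    """
    request = urllib.request.Request(
        url,
        headers={"User-Agent": user_agent},
        method="HEAD",  # ask for the headers only, not the body
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Type", "")


# Intended use (needs network access, so it is left commented out):
# if is_xml_content_type(head_content_type(url)):
#     ... issue a normal GET and process the XML ...
```

Some servers answer HEAD incorrectly or refuse it, so it may be worth
falling back to the GET-and-inspect approach from the question when the
HEAD request fails.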

-- 
Gabriel Genellina