check if an URL exists without opening it

David LeBlanc whisper at oz.net
Fri Aug 9 19:38:57 EDT 2002


I think you want the HTTP HEAD request, which is expressly intended to get
the date and other information about a page (its headers) in lieu of actually
downloading all of it.
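
For what it's worth, here is a minimal sketch of an explicit HEAD request. It
is written for Python 3, where httplib is named http.client. The helper names
head_status and url_exists are made up for illustration, and treating any 2xx
or 3xx status as "exists" is an assumption on my part. Some servers refuse or
mishandle HEAD, so take it as a starting point rather than a guarantee.

```python
import http.client
from urllib.parse import urlsplit


def head_status(url, timeout=10.0):
    """Send a HEAD request and return (status, headers); no body is fetched.

    Hypothetical helper, not part of any library.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    # Rebuild the request target; an empty path has to be sent as "/".
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    try:
        conn.request("HEAD", target)
        response = conn.getresponse()
        return response.status, dict(response.getheaders())
    finally:
        conn.close()


def url_exists(url, timeout=10.0):
    """Assumption: any 2xx or 3xx answer to HEAD counts as 'the URL exists'."""
    try:
        status, _headers = head_status(url, timeout)
    except (OSError, http.client.HTTPException):
        # DNS failure, refused connection, timeout, or a malformed reply.
        return False
    return 200 <= status < 400
```

Calling url_exists("http://www.yahoo.com/try.pdf") then costs one header-only
round trip, and head_status gives you the same kind of headers shown further
down if you want Content-Length or Last-Modified as well.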

Note that urllib.urlopen(url) itself sends a GET, not a HEAD, but the info()
call on the object it returns hands you the headers without your having to
read the body:

import urllib

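# urlopen() returns a file-like object; the body is not read until you call read()
site = urllib.urlopen("http://python.org")
# info() gives the response headers as a message object
meta = site.info()
print meta
site.close()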

Date: Fri, 09 Aug 2002 23:33:55 GMT
Server: Apache/1.3.26 (Unix)
Last-Modified: Wed, 07 Aug 2002 03:54:08 GMT
ETag: "5a750f-3309-3d5099e0"
Accept-Ranges: bytes
Content-Length: 13065
Connection: close
Content-Type: text/html

See the Python documentation on urllib for more.

David LeBlanc
Seattle, WA USA

> -----Original Message-----
> From: python-list-admin at python.org
> [mailto:python-list-admin at python.org]On Behalf Of David
> Sent: Friday, August 09, 2002 14:10
> To: python-list at python.org
> Subject: Re: check if an URL exists without opening it
>
>
> Thanks for your answer.
> But how do GetRight, Gozilla, or Download Accelerator Plus work?
>
> It seems to me that they check the existence before beginning the
> download.
>
> "Michael Chermside" <mcherm at destiny.com> wrote in message
> news:mailman.1028922395.8815.python-list at python.org...
> > > I would like to check if an URL exists.
> > > (for instance http://www.yahoo.com/try.pdf)
> > >
> > > The method urllib.open is unsatisfactory because the URL (which will be a
> > > file in my program) is opened ! So it can take too long time, just to check
> > > the existence !
> >
> > Unfortunately, this is impossible for many kinds of URLs. http: urls,
> > for instance, can only be detected by an attempt to download them.
> >
> > You COULD try opening the URL and then abandoning the download as soon
> > as you get the first few bytes of the content, but I wouldn't advise it.
> > A huge amount of the overhead is in creating the TCP/IP connection and
> > sending HTTP headers.. if there's a reasonable chance you'll want to
> > download the contents of the URL I'd go ahead and do it.
> >
> > -- Michael Chermside




