newbie problem: use socket lib to retrieve one web page

David LeBlanc whisper at oz.net
Thu Sep 5 22:15:56 EDT 2002


And for real convenience, there's websucker.py in the Tools/webchecker
directory of the standard Python source distribution.

David LeBlanc
Seattle, WA USA

> -----Original Message-----
> From: python-list-admin at python.org
> [mailto:python-list-admin at python.org]On Behalf Of Cameron Laird
> Sent: Thursday, September 05, 2002 17:24
> To: python-list at python.org
> Subject: Re: newbie problem: use socket lib to retrieve one web page
>
>
> In article <mailman.1031240588.2234.python-list at python.org>,
> Erik Price  <erikprice at mac.com> wrote:
> >
> >On Wednesday, September 4, 2002, at 11:53  PM, koko wrote:
> >
> >> I write this to retrieve one web page using socket lib.
> 			.
> 			.
> 			.
> >Works for me:
> >
> > >>> import socket
> > >>> host = 'www.uic.edu'
> > >>> port = 80
> > >>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
> > >>> s.connect((host, port))
> > >>> header = """HEAD /home/events.shtml HTTP/1.0
> >... From: hh at uic.edu
> >... User-Agent: test/1.0
> >...
> >... """
> > >>> s.send(header)
> >72
> > >>> data = s.recv(4096)
> > >>> print data
> >HTTP/1.1 200 OK
> >Date: Thu, 05 Sep 2002 15:40:00 GMT
> >Server: Apache/1.3.26 (Unix) PHP/4.1.2 mod_perl/1.27 mod_ssl/2.8.10
> >OpenSSL/0.9.6
> >Connection: close
> >Content-Type: text/html
> >
> >
> > >>> s.close()
> >
> >
> >I used the HEAD method instead of GET for brevity.  But GET works too.
> 			.
> 			.
> 			.
> To criticize code that's already giving satisfaction
> is nearly beyond me.  I'll point out, though, that
> the original questioner might consider this
> alternative which has, I believe, evident advantages
> for long-term maintenance:
>   from urllib import urlopen
>
>   URL = "http://www.uic.edu"
>   page = urlopen(URL).read()
>   print page
> Note that urllib is one of the batteries the standard
> Python distribution includes.
>
> My summary:  use higher-level facilities when applicable.
> --
>
> Cameron Laird <Cameron at Lairds.com>
> Business:  http://www.Phaseit.net
> Personal:  http://starbase.neosoft.com/~claird/home.html
> --
> http://mail.python.org/mailman/listinfo/python-list
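
For readers trying the quoted socket session on a current interpreter,
here is a minimal Python 3 sketch of the same HEAD request. It is not
code from the thread: the helper names build_request, recv_all and fetch
are hypothetical, the Host header is an addition (HTTP/1.0 does not
require it, but name-based virtual hosts usually do), and the 10-second
timeout is an arbitrary choice. It differs from the session in three
ways: CRLF line endings, sendall() instead of send(), and reading until
the server closes the connection rather than relying on a single
recv(4096).

```python
import socket


def build_request(host, path="/", method="HEAD"):
    """Return an HTTP/1.0 request as bytes, using CRLF line endings."""
    lines = [
        "%s %s HTTP/1.0" % (method, path),
        "Host: %s" % host,          # assumption: added for virtual hosts
        "User-Agent: test/1.0",
        "",                         # the two empty strings produce the
        "",                         # blank line that ends the headers
    ]
    return "\r\n".join(lines).encode("ascii")


def recv_all(sock, bufsize=4096):
    """Read until the peer closes; one recv() may return only part of a reply."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:               # empty bytes means the peer closed
            break
        chunks.append(chunk)
    return b"".join(chunks)


def fetch(host, path="/", port=80, method="HEAD"):
    """Send one request and return the raw response bytes."""
    with socket.create_connection((host, port), timeout=10) as s:
        s.sendall(build_request(host, path, method))
        return recv_all(s)
```

Calling fetch("www.uic.edu", "/home/events.shtml") would attempt the
request from the quoted session; whether that host and path still
answer is not something this sketch can promise.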

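
The urllib example quoted above is Python 2. A rough Python 3
equivalent, assuming urllib.request from the standard library, might
look like the following. fetch_page is a hypothetical name, and falling
back to UTF-8 when the server declares no charset is an assumption, not
a rule.

```python
from urllib.request import urlopen


def fetch_page(url, timeout=10):
    """Return the body found at `url`, decoded to text."""
    with urlopen(url, timeout=timeout) as resp:
        # Prefer the charset from the Content-Type header;
        # UTF-8 is an assumed fallback when none is declared.
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

With this helper, print(fetch_page("http://www.uic.edu")) plays the
role of the three quoted lines, again assuming the site is reachable.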