What is the best way to "get" a web page?

Paul McGuire ptmcg at austin.rr._bogus_.com
Sat Sep 23 23:35:33 EDT 2006


"Pete" <harbingerofpeace at post.com> wrote in message 
news:1159068122.933629.128440 at i3g2000cwc.googlegroups.com...
>I have the following code:
>
>>>> web_page = urllib.urlopen("http://www.python.org")
>>>> file = open("temp.html", "w")
>>>> web_page_contents = web_page.read()
>>>> file.write(web_page_contents)
>>>> file.close
> <built-in method close of file object at 0xb7cc76e0>
>>>>
>
> The file "temp.html" is created, but it doesn't look like the page at
> www.python.org. I'm guessing there are multiple frames and my code did
> not get everything. Can anyone point me to a tutorial or other
> reference on how to "get" all of the html contents at a particular
> page?
>
> Why did Python print the line after "file.close"?
>
> Thanks,
> Pete
>

A. You didn't actually invoke the close method; you simply referenced it, 
which is why you got the output line after file.close.  Python is not VB. 
To call close, you have to follow it with parentheses, as in:

file.close()

Calling close() also flushes any buffered output to temp.html, so the file 
will probably then contain the missing content you were looking for.
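
A minimal sketch of the difference between referencing and calling close. 
The temporary path and the page contents here are placeholders made up for 
the illustration; they are not from the original code, and no web page is 
fetched:

```python
import os
import tempfile

# Hypothetical stand-ins for the real file name and downloaded page.
path = os.path.join(tempfile.mkdtemp(), "temp.html")
page_contents = "<html>example</html>"

temp_file = open(path, "w")
temp_file.write(page_contents)

# No parentheses: this only looks up the method, it does not run it.
method_ref = temp_file.close
still_open = not temp_file.closed

# With parentheses: the method runs, buffered data is flushed, file closes.
temp_file.close()
now_closed = temp_file.closed

# Reading the file back after the close sees everything that was written.
check_file = open(path)
saved = check_file.read()
check_file.close()
```

At the interactive prompt, the bare reference is what produces the 
"<built-in method close ...>" line: the interpreter echoes the value of the 
expression, which is the method object itself.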

B. Don't name variables "file", or "list", "str", "dict", "int", etc.  Doing 
so shadows the builtin data types of the same name, so you can no longer 
reach them by that name in that scope.  Try "tempFile" instead.
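
A small sketch of what the shadowing looks like in practice, using "list" 
as the example name. The function name shadowing_demo is made up for the 
illustration:

```python
def shadowing_demo():
    # Inside this function, "list" now names a list object, not the type.
    list = [1, 2, 3]
    try:
        list("abc")  # tries to call the list object, which is not callable
    except TypeError as exc:
        return str(exc)
    return None

error_message = shadowing_demo()

# Outside the function the builtin is untouched and works as usual.
letters = list("abc")
```

The same thing happens at module level or at the interactive prompt, except 
that there the shadowing lasts until the name is deleted with "del".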

-- Paul




