Getting the source of a remote webpage

Grant Edwards grante at visi.com
Wed Jun 23 23:51:39 EDT 2004


In article <pan.2004.06.24.03.16.49.294451 at newsfeeds.com>, Nugget wrote:

>> Is it possible to get the source of a remote webpage (just the
>> straight HTML) without anything too complicated (like COM, I
>> just want to write a simple script if possible) and write it
>> to a file? For example, is it possible to do something like
>> os.system(....run iexplorer.exe command....) with another
>> command that automatically opens the page? Thanks.
> 
> import urllib
> 
> f = urllib.urlopen(url)
> s = site.read()

Lest we lead the gentle reader astray...

>>> import urllib
>>> f = urllib.urlopen('http://www.visi.com')
>>> s = site.read()

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name 'site' is not defined

What you meant was:

>>> s = f.read()
>>> print s
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<html>
[...]
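
The original question also asked about writing the page to a file. A minimal sketch of that, assuming Python 3 (where urlopen lives in urllib.request rather than urllib); the function name save_page is made up for illustration and is not part of the thread:

```python
import shutil
import urllib.request


def save_page(url, path):
    """Fetch url and write the raw response bytes to path."""
    # Open the remote resource and the local file together, so both
    # are closed even if the copy fails partway through.
    with urllib.request.urlopen(url) as response, open(path, "wb") as out:
        # Stream the body in chunks instead of reading it all into memory.
        shutil.copyfileobj(response, out)
```

Something like save_page('http://www.visi.com', 'visi.html') would then leave the HTML in visi.html (a hypothetical filename). The file is opened in binary mode, so the bytes are saved exactly as the server sent them, with no guessing about the encoding.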

-- 
Grant Edwards                   grante             Yow!  ... A housewife
                                  at               is wearing a polypyrene
                               visi.com            jumpsuit!!


