begin to parse a web page not entirely downloaded

Thu Feb 8 12:54:29 EST 2007

k0mp wrote:
> Is there a way to retrieve a web page and before it is entirely
> downloaded, begin to test if a specific string is present and if yes
> stop the download ?
> I believe that urllib.openurl(url) will retrieve the whole page before
> the program goes to the next statement.

Use urllib.urlopen(), but call .read() with a smallish argument, e.g.:

 >>> import urllib
 >>> foo = urllib.urlopen('http://google.com')
 >>> foo.read(512)
'<html><head> ...

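foo.read(512) will return as soon as 512 bytes have been received (or
fewer, if the page ends first), without waiting for the rest of the
page. You can keep calling it until it returns an empty string,
indicating that there's no more data to be read. If the string you are
looking for turns up earlier, stop reading and call foo.close() to
abandon the rest of the download.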
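
Here is a rough sketch of that read loop, in case it helps. The names
stream_contains, needle and chunk_size are made up for illustration, and
it assumes only that the object passed in has a .read(n) method like the
one urlopen() returns. It keeps the last few bytes of each chunk so that
a match split across two reads is not missed.

```python
import io


def stream_contains(stream, needle, chunk_size=512):
    """Return True as soon as `needle` shows up in `stream`.

    `stream` is any object with a .read(n) method (hypothetical helper,
    not part of urllib). Reading stops at the first match.
    """
    tail = b""                    # end of the previous chunk
    keep = len(needle) - 1        # bytes needed to catch a split match
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:             # empty result: no more data
            return False
        window = tail + chunk
        if needle in window:
            return True
        # Carry over just enough bytes to detect a match that
        # straddles the boundary between this chunk and the next.
        tail = window[-keep:] if keep > 0 else b""


# Stand-in for a network response, so the example runs offline.
page = io.BytesIO(b"a" * 510 + b"needle" + b"b" * 2000)
print(stream_contains(page, b"needle"))
```

With a real page you would pass the object returned by
urllib.urlopen(url) (urllib.request.urlopen(url) on Python 3, where the
search string must be bytes) and close it once the function returns.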