begin to parse a web page not entirely downloaded

Thu Feb 8 14:02:09 EST 2007

k0mp wrote:
> It seems to take more time when I use read(size) than just read.
> I think in both cases urllib.urlopen retrieves the whole page.

Google's home page is very small, so it's not really a good test of
whether read(size) stops short of the whole page. Here's a test
downloading the first 512 bytes of an Ubuntu ISO (beware of wrap):

$ python -m timeit -n1 -r1 "import urllib" \
  "urllib.urlopen('http://ubuntu.cs.utah.edu/releases/6.06/ubuntu-6.06.1-desktop-i386.iso').read(512)"
1 loops, best of 1: 596 msec per loop
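
The same measurement can be sketched as a small script. This is only an illustration, not the code used for the timing above: it assumes Python 3, where urllib.request.urlopen is the successor to the urllib.urlopen call in the command, and the helper name time_partial_read is made up for this example.

```python
import time
import urllib.request


def time_partial_read(url, size=512):
    """Open url, read at most `size` bytes, and return (data, seconds).

    Hypothetical helper for illustration; not part of urllib.
    """
    start = time.perf_counter()
    # The response object is file-like: read(size) returns at most
    # `size` bytes instead of draining the whole body.
    with urllib.request.urlopen(url) as response:
        data = response.read(size)
    elapsed = time.perf_counter() - start
    return data, elapsed
```

Called with the ISO URL from the command above (which may no longer resolve), this should come back in well under the time needed to fetch a CD image of several hundred megabytes, which is what the 596 msec figure suggests for the original test.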