begin to parse a web page not entirely downloaded

k0mp Michel.Al1 at gmail.com
Thu Feb 8 14:48:07 EST 2007


On Feb 8, 8:02 pm, Leif K-Brooks <eurl... at ecritters.biz> wrote:
> k0mp wrote:
> > It seems to take more time when I use read(size) than just read().
> > I think in both cases urllib.urlopen retrieves the whole page.
>
> Google's home page is very small, so it's not really a great test of
> that. Here's a test downloading the first 512 bytes of an Ubuntu ISO
> (beware of wrap):
>
> $ python -m timeit -n1 -r1 "import urllib"
> "urllib.urlopen('http://ubuntu.cs.utah.edu/releases/6.06/ubuntu-6.06.1-desktop-i386.is...)"
> 1 loops, best of 1: 596 msec per loop

OK, you've convinced me. The fact that I didn't get better results in my
test with read(512) must be because most of the time is spent waiting for
the server's response, not transferring data over the network.
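For anyone reading this later: the thread uses Python 2's urllib.urlopen; in Python 3 the equivalent is urllib.request.urlopen. A minimal sketch of reading only a prefix of a page (the helper name fetch_prefix is my own, not from the thread) — read(size) pulls at most that many bytes off the socket, so the rest of a large body is never transferred:

```python
import urllib.request

def fetch_prefix(url, size=512):
    # Open the URL and read only the first `size` bytes of the body.
    # Closing the response (via the with-block) abandons the rest of
    # the transfer, so for a large file only ~size bytes travel over
    # the network after the HTTP headers.
    with urllib.request.urlopen(url) as resp:
        return resp.read(size)
```

For a tiny page like Google's front page this saves essentially nothing, which matches the observation above: connection setup and server response latency dominate, and the whole body arrives in the first packet or two regardless of how much you ask read() for.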




More information about the Python-list mailing list