urllib so slow

Andrew MacIntyre andymac at bullseye.apana.org.au
Mon Feb 10 05:17:49 EST 2003


[posted & mailed]

On Sat, 8 Feb 2003, Paul Nilsson wrote:

> On Sat, 8 Feb 2003 14:43:15 -0600, an infinite amount of monkeys
> hijacked the computer of Skip Montanaro <skip at pobox.com> and wrote:
>
> >
> >    Paul> Can someone tell me why urllib is so slow? The code below takes
> >    Paul> over 12 seconds to execute just for the google webpage!
> >
> >Perhaps something platform dependent or you just hit a slow combination of
> >google server and/or network congestion?  On my Mac OS X system connected
> >via cable modem I get respectable results:
>
> I get the same result loading pages from my linux box over the 100MB
> connection so I don't think it can be a networking problem. Perhaps it
> is a win98 specific problem.

Some time ago this came up in relation to FreeBSD.  urllib uses httplib,
which uses an unbuffered socket so that various file descriptor
manipulations can be done.

Unbuffered file I/O in multithreaded apps is heavily dependent on the
implementation of the reentrant C library.
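
To give a feel for why buffering matters when reading a response line by
line, here is a rough sketch. It is written for present-day Python 3 and
its io module, not for httplib's actual code path, and it uses no network
at all. CountingRaw and count_reads are made-up names for illustration:
CountingRaw stands in for an unbuffered socket file and simply counts how
many low-level reads are issued.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical stand-in for an unbuffered socket file object.

    It serves bytes from memory and counts every low-level read request.
    """

    def __init__(self, data):
        self._src = io.BytesIO(data)
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, b):
        # Each call here corresponds to one trip to the underlying source.
        self.calls += 1
        return self._src.readinto(b)


def count_reads(data, buffered):
    """Read all lines from data; return (line count, low-level read count)."""
    raw = CountingRaw(data)
    f = io.BufferedReader(raw) if buffered else raw
    lines = 0
    for _ in iter(f.readline, b""):
        lines += 1
    return lines, raw.calls


# Made-up payload resembling a block of HTTP header lines.
payload = b"Header: value\r\n" * 200

unbuf_lines, unbuf_calls = count_reads(payload, buffered=False)
buf_lines, buf_calls = count_reads(payload, buffered=True)

print("unbuffered: %d lines, %d low-level reads" % (unbuf_lines, unbuf_calls))
print("buffered:   %d lines, %d low-level reads" % (buf_lines, buf_calls))
```

Without a buffer, readline() has to ask the underlying object for one byte
at a time, so the number of low-level reads grows with the number of bytes;
with a buffer it only needs a handful. How expensive each of those small
reads is in practice depends on the platform, which fits the differences
people have reported.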

I don't know whether or not this is your problem, but I recall someone in
the earlier thread noting that Win9x (maybe Win2k?) appeared to be
performing suboptimally compared to Linux, but a little better than
FreeBSD.

I do plan to work up a patch that would allow urllib to get a buffered
file back from httplib, but haven't got to it yet :-(

--
Andrew I MacIntyre                     "These thoughts are mine alone..."
E-mail: andymac at bullseye.apana.org.au  | Snail: PO Box 370
        andymac at pcug.org.au            |        Belconnen  ACT  2616
Web:    http://www.andymac.org/        |        Australia





