[Chicago] threads and xmlrpc?

Fri Jan 30 16:45:47 CET 2009

The 2 connections per host is defined in the HTTP RFC:
http://www.faqs.org/rfcs/rfc2068.html

See section 8.1.4.

The RFC says a single-user client "SHOULD maintain AT MOST 2 connections
with any server or proxy", and a lot of HTTP client libraries obey this.  I
know for a fact that the .NET web client class does.  I don't know for sure
what Python does, so I'd hate to comment.

This is one of the reasons why a lot of HTTP client libraries have you
create requests through a factory function rather than instantiating the
class directly:

>>> import urllib2
>>> f = urllib2.urlopen('http://www.python.org/') # returns a file-like response object, not a Request
>>> print f.read(100)

Rather than something like this (hypothetical; urllib2.Request has no
open() method):
>>> import urllib2
>>> r = urllib2.Request("http://www.python.org")
>>> print r.open().read(100)

The Java and .NET HTTP client libraries I've used all implement it in a
similar way because it's easier to set up stuff like connection limits and
keep-alive.
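
To make the factory point concrete, here is a minimal sketch using only the
Python 3 standard library. The names limiter_for, fetch, and MAX_PER_HOST are
made up for illustration and are not part of urllib2 or httplib2; the
assumption is simply that a factory can hand every caller the same per-host
state, which is harder to arrange with a bare constructor.

```python
import threading
from urllib.parse import urlsplit

MAX_PER_HOST = 2  # the limit suggested by RFC 2068, section 8.1.4

_limiters = {}                      # host -> shared semaphore
_limiters_lock = threading.Lock()   # protects the dict above


def limiter_for(url):
    """Factory: return the one shared semaphore for the URL's host."""
    host = urlsplit(url).netloc.lower()
    with _limiters_lock:
        if host not in _limiters:
            _limiters[host] = threading.BoundedSemaphore(MAX_PER_HOST)
        return _limiters[host]


def fetch(url, opener):
    """Call opener(url) while holding one of the host's connection slots."""
    with limiter_for(url):
        return opener(url)
```

Any number of threads can call fetch(url, some_opener), where some_opener is
whatever callable actually performs the request; at most two of them will be
talking to the same host at once, and the rest block until a slot frees up.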

In any case, from my Python web scraping days with httplib2, I found that I
could reduce the number of timeouts and request errors by waiting 1 second
after every request to a particular host.
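
A rough sketch of that per-host pause, again in Python 3 with only the
standard library. HostThrottle and its parameters are hypothetical names for
illustration, not anything httplib2 provides, and the class is not
thread-safe as written; it only shows the bookkeeping.

```python
import time
from urllib.parse import urlsplit


class HostThrottle:
    """Enforce a minimum gap between requests to the same host."""

    def __init__(self, delay=1.0, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self._clock = clock
        self._sleep = sleep
        self._last = {}  # host -> time the previous request finished

    def call(self, url, opener):
        """Wait out the rest of the delay for this host, then call opener(url)."""
        host = urlsplit(url).netloc.lower()
        last = self._last.get(host)
        if last is not None:
            remaining = self.delay - (self._clock() - last)
            if remaining > 0:
                self._sleep(remaining)
        try:
            return opener(url)
        finally:
            # Record the finish time even if the request raised.
            self._last[host] = self._clock()
```

Requests to different hosts are not delayed against each other; only
back-to-back requests to the same host wait. The clock and sleep arguments
exist so the delay can be replaced with a fake in tests.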

-Tim Gebhardt
tim at gebhardtcomputing.com

On Thu, Jan 29, 2009 at 10:46 PM, Lukasz Szybalski <szybalski at gmail.com> wrote:

> On Thu, Jan 29, 2009 at 9:02 AM, Tim Gebhardt <tim at gebhardtcomputing.com>
> wrote:
> > If xmlrpc obeys the HTTP standard connection limit, you're limited to 2
> > concurrent connections per host.
>
> Could you point me to some docs on this? What I am comparing it to is
> an Apache server which can handle 100+ requests per second with no
> problems. With Project Gutenberg we are talking about TB of data. With
> Pypi we are talking about <kb per request and maybe about ~3kb per
> second. So I think I should be able to achieve bandwidth of about
> 20kb/s minimum without anybody noticing any performance hits.
>
> I've emailed pypi, but if there are other things to consider, or you
> might know why such a low throughput on xmlrpc I would be interested
> to know more.
>
> Thanks,
> Lucas
>