Async Client with 1K connections?

Paul Rubin http
Wed Feb 11 13:35:15 EST 2004


"William Chang" <williamichang at hotmail.com> writes:
> Thank you all for the discussion!  Some additional information:
> 
> One of the intended uses is indeed a next-gen web spider.  I did the
> math, and yes I will need about 10 cutting-edge PCs to spider like
> you-know-who.  But I shouldn't need 100 -- and would rather not
> spend money unnecessarily...  Throughput per PC would be on
> the order of 1MB/s assuming 200x5KB downloads/sec using 1-2000
> simultaneous connections.  (That's 17M pages per day per PC.)

That's orders of magnitude less than you-know-who.  Also, don't forget
how many queries you have to take from users, and the number of disk
seeks needed for each one.

> Nevertheless, it shouldn't cost millions.  Maybe $100K :-)

10 MB/s of internet connectivity (10 PCs at 1 MB/s each) is at least a
few K$/month all by itself.
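
For anyone who wants to check the arithmetic in this thread, here is a
rough sketch in Python.  The input figures (200 downloads/sec, 5KB per
page, 10 PCs) are the ones quoted above; the function names, and the
conventions 1 MB = 1000 KB and 8 bits per byte, are just assumptions
made for illustration.

```python
# Back-of-the-envelope check of the spider throughput figures quoted above.
# Inputs come from the thread; names and unit conventions are illustrative.

SECONDS_PER_DAY = 24 * 60 * 60

DOWNLOADS_PER_SEC = 200   # per PC, as quoted
PAGE_SIZE_KB = 5          # average page size, as quoted
NUM_PCS = 10              # size of the proposed cluster, as quoted


def per_pc_throughput_mb_s(downloads_per_sec=DOWNLOADS_PER_SEC,
                           page_size_kb=PAGE_SIZE_KB):
    """Download bandwidth for one PC in MB/s (assuming 1 MB = 1000 KB)."""
    return downloads_per_sec * page_size_kb / 1000


def pages_per_day(downloads_per_sec=DOWNLOADS_PER_SEC):
    """Pages fetched by one PC in 24 hours at a sustained rate."""
    return downloads_per_sec * SECONDS_PER_DAY


def aggregate_throughput_mb_s(num_pcs=NUM_PCS):
    """Total bandwidth for the whole cluster in MB/s."""
    return num_pcs * per_pc_throughput_mb_s()


if __name__ == "__main__":
    print("per PC:   ", per_pc_throughput_mb_s(), "MB/s")
    print("per PC:   ", pages_per_day(), "pages/day")
    print("aggregate:", aggregate_throughput_mb_s(), "MB/s")
    print("aggregate:", aggregate_throughput_mb_s() * 8, "Mbit/s")
```

This gives 1.0 MB/s and 17,280,000 pages/day per PC (the "17M" figure),
and 10 MB/s, or about 80 Mbit/s, for all ten machines together.  It
assumes the rate is sustained around the clock, which a real spider
probably won't manage.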


