web crawler in python or C?

Fuzzyman fuzzyman at gmail.com
Thu Feb 16 03:29:15 EST 2006


abhinav wrote:
> It is DSL broadband 128kbps.But thats not the point.What i am saying is
> that would python be fine for implementing fast crawler algorithms or
> should i use C.

But a web crawler is going to be *mainly* I/O bound - so language
efficiency won't be the main issue. There are several web crawlers
implemented in Python.
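
To illustrate what "I/O bound" means in practice, here is a minimal sketch.
It assumes a made-up fake_fetch() that sleeps instead of touching the
network (the URLs, the delay and all function names are hypothetical, not
from any real crawler). The point is only that waiting, not computing,
dominates the run time, so overlapping the waits helps far more than a
faster language would.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(url, delay=0.05):
    """Hypothetical stand-in for a network fetch: it only waits."""
    time.sleep(delay)          # simulated network latency
    return url, len(url)       # pretend "page size"


def crawl_sequential(urls):
    """Fetch one URL at a time; total time is roughly len(urls) * delay."""
    return [fake_fetch(u) for u in urls]


def crawl_threaded(urls, workers=8):
    """Overlap the waits with a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_fetch, urls))


def timed(fn, urls):
    """Run fn(urls) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(urls)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    urls = ["http://example.invalid/page%d" % i for i in range(8)]
    _, t_seq = timed(crawl_sequential, urls)
    _, t_thr = timed(crawl_threaded, urls)
    print("sequential: %.2fs  threaded: %.2fs" % (t_seq, t_thr))
```

With a real crawler the sleep would be a socket read, but the shape of the
result should be similar: the threaded version finishes sooner because the
waits overlap.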

> Handling huge data,multithreading,file
> handling,heuristics for ranking,and maintaining huge data
> structures.What should be the language so as not to compromise that
> much on speed.What is the performance of python based crawlers vs C
> based crawlers.Should I use both the languages(partly C and python).How

If your data processing requirements are fairly heavy you will
*probably* get a speed advantage by coding that processing in C and
calling it from Python.
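
As a rough sketch of what "C accessed from Python" can look like, the
example below uses the standard ctypes module to call strlen() from the
system C library. It assumes a Unix-like system where the C library can be
located this way, and c_strlen is just an illustrative name. For your own
code you would compile your C routines into a shared library and load that
instead (or write a proper extension module).

```python
import ctypes
import ctypes.util

# Locate and load the system C library (assumption: Unix-like platform).
# If find_library() returns None, CDLL(None) falls back to the symbols
# already loaded into the current process on POSIX systems.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data):
    """Return the length of a bytes object as computed by C's strlen()."""
    return libc.strlen(data)


if __name__ == "__main__":
    print(c_strlen(b"crawler"))
```

Each call across the boundary has some overhead, so this pays off when the
C side does a substantial chunk of work per call.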

The usual advice (which seems to be applicable to you) is to
prototype in Python (which will be much more fun than in C), then test.

If the Python prototype isn't fast enough (and it may well be), profile
to find your real bottlenecks and move only those bottlenecks to C.
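
A minimal sketch of the profiling step, using the standard cProfile and
pstats modules. The functions extract_links(), rank() and process() are
hypothetical stand-ins for a crawler's data processing, and rank() is
deliberately inefficient so that it shows up in the report.

```python
import cProfile
import io
import pstats


def extract_links(page):
    """Hypothetical parser: treat every word starting with 'http' as a link."""
    return [word for word in page.split() if word.startswith("http")]


def rank(links):
    """Hypothetical ranking: order links by how often they occur.

    Deliberately quadratic, so the profiler has something to find.
    """
    return sorted(links,
                  key=lambda link: sum(1 for other in links if other == link),
                  reverse=True)


def process(pages):
    """Extract links from every page, then rank them."""
    links = []
    for page in pages:
        links.extend(extract_links(page))
    return rank(links)


def profile_report(pages, top=10):
    """Profile process(pages) and return the report text, slowest first."""
    profiler = cProfile.Profile()
    profiler.enable()
    process(pages)
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()


if __name__ == "__main__":
    pages = ["see http://a.invalid and http://b.invalid here"] * 200
    print(profile_report(pages))
```

Whatever sits at the top of the cumulative-time column is the candidate
for a rewrite in C; everything else can stay in Python.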

All the best,

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml

> should i decide what part to be implemented in C and what should be
> done in python?
> Please guide me.Thanks.



