Built-in datatypes speed

Thu Feb 8 21:17:55 EST 2007

On Feb 7, 2:34 am, Maël Benjamin Mettler <mbmett... at access.unizh.ch>
wrote:
> Anyway, I reimplemented parts of TigerSearch (http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERSearch/) in Python.
> I am currently writing the paper that goes along with this
> reimplementation. Part of the paper deals with the
> differences/similarities in the original Java implementation and my
> reimplementation. In order to superficially evaluate differences in
> speed, I used this paper (http://www.ubka.uni-karlsruhe.de/cgi-bin/psview?document=ira/2000/5&f...
> ) as a reference. Now, this is not about speed differences between Java
> and Python, mind you, but about the speed built-in datatypes
> (dictionaries, lists etc.) run at. As far as I understood it from the
> articles and books I read, any method call from these objects run nearly
> at C-speed (I use this due to lack of a better term), since these parts
> are implemented in C. Now the question is:
>
> a) Is this true?
> b) Is there a correct term for C-speed and what is it?

I think the statement is highly misleading.  It is true that most of
the underlying operations on native data types are implemented in C.
If the operations themselves are expensive, they can run close to
the speed of a suitably generic C implementation of, say, a
hashtable.  But with richer data types, there is a good chance of
landing back in Python land, e.g. via __hash__, __eq__, etc.
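
To make the "landing back in Python land" point concrete, here is a
minimal sketch.  The Key class and its counters are hypothetical names
made up for illustration; the only thing relied on is that a dict calls
the key's __hash__ and __eq__ hooks, which run as ordinary Python code
when the key type defines them in Python.

```python
class Key(object):
    """Hypothetical key type that counts calls to its Python-level hooks."""
    hash_calls = 0
    eq_calls = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        # Called by the dict's C code on every insert and lookup.
        Key.hash_calls += 1
        return hash(self.value)

    def __eq__(self, other):
        # Called by the dict's C code when two hashes match.
        Key.eq_calls += 1
        return isinstance(other, Key) and self.value == other.value


table = {}
for i in range(1000):
    table[Key(i)] = i        # each insert goes through Key.__hash__

found = table[Key(500)]      # an equal but distinct key object

print("hash calls: %d" % Key.hash_calls)
print("eq calls:   %d" % Key.eq_calls)
```

With plain ints or strs as keys the same loop never leaves C for
hashing or comparison, which is why the two cases can differ so much
in speed.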

Also, dispatching a method call from Python to C is relatively slow.  A loop such as:

lst = []
for i in xrange(int(10e6)):
    lst.append(i)

will spend most of its time in method dispatch and iterating, and very
little in the "guts" of append().

Those guts, mind, will be quick.
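
One rough way to check this yourself is to time the same list being
built three ways with the standard timeit module.  This is only a
sketch: the function names and the value of N are my own choices, N is
much smaller than in the loop above so it finishes quickly, and it uses
range() rather than xrange() so it also runs on newer Pythons.  The
actual numbers depend on your machine and interpreter version.

```python
import timeit

N = 100000  # assumption: far fewer iterations than the 10e6 loop above


def append_loop():
    # Looks up lst.append and dispatches to it on every iteration.
    lst = []
    for i in range(N):
        lst.append(i)
    return lst


def append_bound():
    # Looks the method up once; the per-iteration call overhead remains.
    lst = []
    append = lst.append
    for i in range(N):
        append(i)
    return lst


def build_in_c():
    # No Python-level loop at all: iteration and appending happen in C.
    return list(range(N))


for func in (append_loop, append_bound, build_in_c):
    # Take the best of three runs of ten calls each.
    seconds = min(timeit.repeat(func, number=10, repeat=3))
    print("%-13s %.4f s" % (func.__name__, seconds))
```

All three produce the same list, so any gap between the first and the
last is the cost of the interpreter loop and method dispatch rather
than of append() itself.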

-Mike