python disk i/o speed

nnes pruebauno at latinmail.com
Wed Aug 7 10:21:28 EDT 2002


Hello all,

Since I wanted to test how fast Python is at processing delimited text
files, something common here, I wrote a quick and dirty mockup of such
a situation.

I generated a file about 7 MB long, with 3 numbers on each line. Then I
wrote a program in Python, Java and ANSI C that generates a second file
based on the first one, with 4 numbers: the original 3 plus the sum of
these.
e.g. "2","5","1" ----> "2","5","1","8"
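
To make the task concrete, here is a minimal sketch of the
transformation in present-day Python 3 (not the 2.2.1 used for the
test). It is not the script that was timed; the function name
add_sum_column and the use of the csv module are assumptions made for
illustration only.

```python
import csv
import io


def add_sum_column(src, dst):
    """Copy each quoted, comma-delimited row from src to dst, appending the row sum."""
    reader = csv.reader(src)
    # QUOTE_ALL keeps every field quoted, matching the example line.
    writer = csv.writer(dst, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for row in reader:
        if not row:  # skip blank lines
            continue
        total = sum(int(field) for field in row)
        writer.writerow(row + [str(total)])


if __name__ == "__main__":
    # In-memory stand-ins for the input and output files.
    src = io.StringIO('"2","5","1"\n')
    dst = io.StringIO()
    add_sum_column(src, dst)
    print(dst.getvalue(), end="")
```

With real files, src and dst would be file objects opened with
newline="" as the csv documentation recommends.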

I recorded the time it took for the programs to get back to the
command prompt after hitting the enter key, using my
"bogo-wristwatch-timer" system.
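
A wristwatch measurement also includes process startup and reaction
time. As a rough alternative for the Python side only, the following
sketch times a single call with time.perf_counter in present-day
Python 3. The helper name timed is hypothetical, and the workload
shown is only a stand-in for the real file-processing call.

```python
import time


def timed(func, *args, **kwargs):
    """Run func once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Stand-in workload; replace with the real file-processing call.
    total, seconds = timed(sum, range(1_000_000))
    print(f"sum={total} took {seconds:.3f} s")
```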

The results I got were:

gcc:         4 seconds
Sun's Java: 19 seconds
Python:     36 seconds

I wonder why there is a difference of almost 10 times between C
and Python, since the programs should be mostly I/O bound and not CPU
bound. Is there also a way of improving the speed of Python in this
situation? If somebody wants to comment on the C and the Java
code, that would be fine too, since I am not an expert programmer.

The tests were performed on Windows 2000 Professional with Service Pack
2.

Versions of the languages were:

DJGPP gcc: 

>gcc -dumpversion
3.1

Sun's Java SDK for Windows:

>java -version
java version "1.4.0_01"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0_01-b03)
Java HotSpot(TM) Client VM (build 1.4.0_01-b03, mixed mode)

Python:

>python
Python 2.2.1 (#34, Apr  9 2002, 19:34:33) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> 

I will post the source of the programs in a follow-up post. The
implementation details are not exactly the same. I spent a couple of
days on a satisfactory ANSI C version, for example, and about 20 minutes
on the Python script. :-)

Nestor


