Huge performance gain compared to perl while loading a text file in a list ...!?

Marc H. coolroot at gmail.com
Sun Mar 13 00:45:38 EST 2005


Hello,

I recently converted one of my Perl scripts to Python. The script
simply searches a lot of big mail files (~40MB) to retrieve
specific emails. I converted it line by line to Python,
keeping the algorithms and functions as they were in Perl (no
optimization). The purpose was mainly to learn Python and see how it
differs from Perl.

Once the converted script was finished, I was amazed to find that
the Python version runs about 8 times faster. Needless
to say, I was very intrigued and wanted to know what causes such a
performance gap between the two versions. To keep the story short,
after some research and a few tests, I found that file I/O is the main
cause of the performance difference.

I made two short test scripts, one in Perl and one in Python (see
below), and compared their run times. As the timings show, the
bigger the file, the larger the difference in performance.

I'm fairly new to Python and don't know much about its inner workings, so
I wonder if someone could explain to me why it is so much faster in
Python to open a file and load it into a list/array?

Thanks


-----
#!/usr/bin/python

# Read the whole file into a list of lines, 20 times
for i in range(20):
    Data = open('data.test').readlines()
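
To time only the load itself, rather than the whole process including interpreter startup, a variant along these lines could be used. This is a minimal sketch and not the script I actually ran: the `time_load` helper and its `repeats` parameter are hypothetical names, and it assumes a Python version that has `time.perf_counter` and the `with` statement.

```python
import time


def time_load(path, repeats=20):
    """Read `path` into a list of lines `repeats` times.

    Returns (number_of_lines, elapsed_seconds). Hypothetical helper,
    not part of the original test scripts.
    """
    data = []
    start = time.perf_counter()
    for _ in range(repeats):
        # Same operation as the test script: slurp every line into a list.
        with open(path) as f:
            data = f.readlines()
    elapsed = time.perf_counter() - start
    return len(data), elapsed
```

Calling `time_load('data.test')` would then give the line count together with the time spent purely in the 20 reads.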

-----
#!/usr/bin/perl

# Read the whole file into an array, 20 times
for ($i = 0; $i < 20; $i++) {
    open(DATA, "data.test") or die "Cannot open data.test: $!";
    @Data = <DATA>;
    close(DATA);
}

-----
Running tests (data.test = 10MB text file):

blop@moya blop $ time ./ftest.py
real    0m6.408s
user    0m4.552s
sys     0m1.826s

blop@moya blop $ time ./ftest.pl
real    0m22.855s
user    0m21.946s
sys     0m0.822s

-----
Running tests (data.test = 40MB text file):

blop@moya blop $ time ./ftest.py
real    0m26.235s
user    0m18.238s
sys     0m7.872s

blop@moya blop $ time ./ftest.pl
real    3m26.741s
user    3m22.168s
sys     0m3.764s
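
For reference, the ratios implied by the 'real' times above can be worked out with a short sketch. The `to_seconds` helper and the `runs` table are just illustrative names; the numbers are copied from the output above.

```python
def to_seconds(minutes, seconds):
    """Convert a `time`-style XmY.ZZZs reading to seconds."""
    return minutes * 60 + seconds


# 'real' times copied from the four runs above
runs = {
    "10MB": {"python": to_seconds(0, 6.408), "perl": to_seconds(0, 22.855)},
    "40MB": {"python": to_seconds(0, 26.235), "perl": to_seconds(3, 26.741)},
}

# How many times longer the Perl run took than the Python run
ratios = {size: t["perl"] / t["python"] for size, t in runs.items()}

for size, ratio in ratios.items():
    print(f"{size}: perl/python = {ratio:.1f}x")
# prints:
# 10MB: perl/python = 3.6x
# 40MB: perl/python = 7.9x
```

So the gap roughly doubles when the file goes from 10MB to 40MB.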


