why python is slower than java?

Alex Martelli aleaxit at yahoo.com
Sat Nov 6 17:17:54 EST 2004


Kent Johnson <kent3737 at yahoo.com> wrote:

> I rarely find myself acting as an apologist for Java, and I understand
> that the point Alex is making is that Python's performance for this 
> operation is quite good, and that the OP should post some code, but this
> is really too unfair a comparison for me not to say something.

I'm glad I posted a sufficiently silly comparison to elicit some
response, then;-)


> There are two major differences between these two programs:
> - The Java version is doing a character by character copy; the Python
> program reads the entire file into a buffer in one operation.
> - The Java program is converting the entire file to and from Unicode;
> the Python program is copying the literal bytes.

Right: each is using the respective language's defaults, and Python's
are apparently tuned for speed, while Java's apparently aren't.
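
To make the two differences Kent lists concrete, here is a minimal sketch
in modern Python 3. It is not either of the programs timed in this thread;
the function names, the latin-1 encoding and the use of `with` are
assumptions made for illustration only. The first function copies the
literal bytes in one read; the second decodes to text and copies one
character at a time, which is roughly the kind of work the original Java
version was described as doing.

```python
def copy_bytes_whole(src, dst):
    """Read the whole file as raw bytes in one call and write it back out."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        fout.write(fin.read())


def copy_text_per_char(src, dst, encoding="latin-1"):
    """Decode to text, copy one character at a time, re-encode on write.

    latin-1 is an assumption: it maps every byte to exactly one character,
    so the round trip cannot fail. newline="" switches off newline
    translation so the output bytes match the input bytes.
    """
    with open(src, "r", encoding=encoding, newline="") as fin, \
         open(dst, "w", encoding=encoding, newline="") as fout:
        while True:
            ch = fin.read(1)      # one decoded character per call
            if not ch:            # empty string means end of file
                break
            fout.write(ch)
```

Both functions should produce identical output files; only the amount of
per-character work differs.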


> Here is a much more comparable Java program (that will fail if the file
> size is over 2^31-1):

I believe that the Python program, if run on a suitable 64-bit OS, on a
ridiculously-large-memory machine, could succeed for much larger files
than that.  Machines with many gigabytes of physical RAM are becoming
far from absurd these days -- I can't yet afford to throw 2500 euros out
of the window to buy 8 GB of fast RAM, but if I could it would fit
snugly into my own cheap dual-G5 powermac, for example (old model: I got
it used/reconditioned many months ago; I think the current cheaper model
does top out at 4 GB).  On 32-bit machines or OSs, of course, Python's
memory limits _will_ byte at fewer GB than that, too.


> import java.io.*;
> 
> public class Copy {
>      public static void main(String[] args) throws IOException {
>          File inputFile = new File("/usr/share/dict/web2");
>          int bufferSize = (int)inputFile.length();
>          File outputFile = new File("/tmp/acopy");
> 
>          FileInputStream in = new FileInputStream(inputFile);
>          FileOutputStream out = new FileOutputStream(outputFile);
> 
>          byte buffer[] = new byte[bufferSize];
>          int len=bufferSize;
> 
>          while (true)
>          {
>              len=in.read(buffer,0,bufferSize);
>              if (len<0 )
>                  break;
>              out.write(buffer,0,len);
>          }
> 
>          in.close();
>          out.close();
>      }
> }
> 
> Here are the results I get with this program and Alex's Python program
> on my G4-400 Mac:
> kent% time java Copy
> 0.440u 0.320s 0:00.96 79.1%     0+0k 9+3io 0pf+0w
> kent% time python Copy.py
> 0.100u 0.120s 0:00.31 70.9%     0+0k 2+4io 0pf+0w

With your program, and mine simply converted to run inside a main()
function rather than at module-level for comparison, switching to tcsh
for direct comparison with the format you're using, I see:

[kallisti:~] alex% time java Copy
0.200u 0.140s 0:00.54 62.9%     0+0k 0+0io 0pf+0w
[kallisti:~] alex% time python Copy.py
0.080u 0.020s 0:00.13 76.9%     0+0k 0+1io 0pf+0w

Which python and java versions are you using?  I'm trying to use the
latest and greatest of each, 2.4b1 (I know, I know, I need to install
b2!) for Python, 1.4.2_05 for Java -- just upgraded to MacOSX 10.3.6,
and my /usr/share/dict/web2 is 2486825 bytes.

> The Python program is still substantially faster (3x), but with nowhere
> near the margin Alex saw.

I still see a 4:1 ratio, but, sure, nothing like the 20:1 that my
original, silly example showed.  Maybe a more realistic program would use
a buffer of some fixed length, rather than 'as long as the whole file'.
Say:

         int bufferSize = 64 * 1024;

in a program that otherwise is just like yours, for the Java side of
things.  Switching back to bash because I can't stand long exposures to
anything in the csh family;-), I see:

kallisti:~ alex$ time java Copy

real    0m0.521s
user    0m0.200s
sys     0m0.120s

after several runs, so everything gets a chance to go to cache.  Oh,
btw, I did compile with 'javac -O Copy.java'.

The closest Python equivalent I know how to write is:

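def main():
    # Binary mode copies the literal bytes, with no newline translation.
    inputFile = open("/usr/share/dict/web2", 'rb')
    bufferSize = 64 * 1024
    outputFile = open("/tmp/acopy", 'wb')

    # Bind the methods to local names to skip attribute lookups in the loop.
    inf = inputFile.read
    ouf = outputFile.write

    while 1:
        buf = inf(bufferSize)
        if not buf: break    # an empty read means end of file
        ouf(buf)

    inputFile.close()
    outputFile.close()

if __name__ == '__main__':
    main()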

and its performance:

kallisti:~ alex$ time python -O Copy.py

real    0m0.135s
user    0m0.050s
sys     0m0.050s

so we still see the 4:1 ratio in favour of Python.  It's barely more
than 3:1 in actual CPU time, user and sys, but for some reason Python
seems to be able to get more of the CPU's % attention -- I don't claim
to understand this!-)
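
The arithmetic behind those two ratios can be checked with a short Python 3
sketch. The numbers are copied from the two bash `time` outputs above; the
dictionary and helper names are made up for this illustration.

```python
# Timings in seconds, copied from the two `time` outputs above.
java = {"real": 0.521, "user": 0.200, "sys": 0.120}
python = {"real": 0.135, "user": 0.050, "sys": 0.050}


def cpu_seconds(t):
    """CPU time actually consumed: user plus sys."""
    return t["user"] + t["sys"]


wall_ratio = java["real"] / python["real"]
cpu_ratio = cpu_seconds(java) / cpu_seconds(python)
java_cpu_share = cpu_seconds(java) / java["real"]
python_cpu_share = cpu_seconds(python) / python["real"]

print("wall-clock ratio: %.2f" % wall_ratio)        # 3.86
print("CPU-time ratio:   %.2f" % cpu_ratio)         # 3.20
print("Java CPU share:   %.0f%%" % (100 * java_cpu_share))      # 61%
print("Python CPU share: %.0f%%" % (100 * python_cpu_share))    # 74%
```

So the wall-clock ratio is close to 4:1 while the CPU-time ratio is about
3.2:1, and the gap comes from the Python run spending a larger share of its
elapsed time on the CPU.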

So I scp'd everything over to the powermac dual g5 1.8 GHz, and ssh'd
there to do more measurements -- no python 2.4 there (it's my production
machine -> no betas!) and again I measured after a few runs to let stuff
get into cache:

macarthur:~ alex$ time java Copy

real    0m0.163s
user    0m0.040s
sys     0m0.110s

macarthur:~ alex$ time python -O Copy.py

real    0m0.039s
user    0m0.020s
sys     0m0.020s

Far better real/cpu ratios of course (a dual-CPU machine does that;-).
And faster overall.  But roughly the same 4:1 ratio in favour of Python.


So, same thing on the oldie but goodie Linux box.  In that case I don't
have an updated Java -- Kaffe 1.0.7 will have to do!  The CPU is oldish
(Athlon 1.2G), but the disk subsystem is really really good.  So I got
(again, good repeatable results after a few runs to stabilize):

[alex@lancelot alex]$ time java Copy
0.07user 0.04system 0:00.11elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (1623major+659minor)pagefaults 0swaps

[alex@lancelot alex]$ time python2.4 -O Copy.py
0.02user 0.01system 0:00.02elapsed 115%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (460major+241minor)pagefaults 0swaps

don't ask ME how python got over 100% CPU on a single-CPU machine; I
guess the task is just too small, at such tiny fractions of a second, to
avoid anomalies.  Still, the roughly 4:1 performance ratio appears to be
repeatable here; and an interesting clue is that _pagefaults_ also
appear to be in roughly 4:1 ratio.

But clearly we need bigger tasks to make measurements less fraught with
anomalies.  Unfortunately, since we don't know exactly which tasks the OP
measured when he reported that "Python is slower than Java" in "a disk
intensive I/O operation", it's hard to guess what such a task should be.
Obviously, his results are anything but _trivial_ to reproduce based on
the dearth of information that he has released about them so far; I find
it hard to think of something more disk-intensive than a simple copy of
one file to another.  Assuming he does have some interest in
ascertaining the truth and getting his questions answered, perhaps he'll
deign to give us Java and Python sources, and reference data file[s],
for a small benchmark that _does_ reproduce his results.  Otherwise, it
appears we may be just shooting in the dark.
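
As one possible shape for such a bigger, repeatable task, here is a hedged
Python 3 sketch of a small harness: it generates a test file of a chosen
size, copies it several times with a fixed 64 KB buffer, and reports the
timings. All function names, paths and sizes are hypothetical, and this is
not the OP's benchmark. Note that it times only the copy inside the
process, so unlike `time python Copy.py` it leaves out interpreter startup.

```python
import time


def make_test_file(path, size_bytes, chunk=1024 * 1024):
    """Write size_bytes of filler data to path (the content is arbitrary)."""
    block = b"x" * chunk
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(block[:n])
            remaining -= n


def copy_chunked(src, dst, buffer_size=64 * 1024):
    """Copy src to dst as raw bytes through a fixed-size buffer."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(buffer_size)
            if not buf:           # empty bytes object means end of file
                break
            fout.write(buf)


def time_copy(src, dst, repeats=5):
    """Return (best, all_timings) in wall-clock seconds over several runs.

    Repeating the copy lets the file settle into the OS cache, as with the
    "after several runs" measurements in this thread.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        copy_chunked(src, dst)
        timings.append(time.perf_counter() - start)
    return min(timings), timings
```

A run might call `make_test_file("/tmp/bigfile", 512 * 1024 * 1024)`
followed by `time_copy("/tmp/bigfile", "/tmp/acopy")`; both paths and the
512 MB size are placeholders to adjust for the machine at hand.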


Alex


