why python is slower than java?

Alex Martelli aleaxit at yahoo.com
Sun Nov 7 03:10:34 EST 2004


Roger Binns <rogerb at rogerbinns.com> wrote:
   ...
> I/O is one area where Java is *very* different than other environments.
> Java emphasises the ability to select at runtime (typically from

Python's dynamism at runtime is definitely one of its fortes.

> Note how in Python, files do read/write whereas sockets do
> send/recv.

Sure, since the semantics are different.  Which is why we have the
makefile method of socket objects: for those occasions in which you want
signature-polymorphism at the expense of the overhead of adaptation.
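
A minimal sketch of that adaptation, assuming a modern Python 3 and using
socket.socketpair() only to avoid needing a network; the helper name
shout is made up for illustration.  The function is written against the
file-like readline interface and is handed a socket wrapped by makefile:

```python
import socket

def shout(stream):
    """Works on anything with readline(): a real file, a socket wrapper, etc."""
    return stream.readline().upper()

# socketpair() returns two already-connected sockets.
left, right = socket.socketpair()

# The native socket interface: send/recv style calls.
left.sendall(b"hello from a socket\n")

# makefile() adapts the socket to the file-like interface, at the cost of
# an extra buffering layer between the caller and the socket.
with right.makefile("rb") as reader:
    result = shout(reader)

left.close()
right.close()
print(result)
```

The same shout function would accept an ordinary file opened in binary
mode, which is the signature-polymorphism being bought.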

> But you are comparing Apples to Oranges.  Java programs are written
> in certain ways with various emphases (flexibility, interfaces and
> factories).  Python programs emphasise other things (generators,
> typing).  Exceptions are expensive in Java and not used much.  They
> are "cheap" in Python and used frequently.

Generators are a reasonably recent addition to Python, and I have no
idea what you mean by stating that Python emphasizes typing more than
Java does -- I typically see far more mention of the types of everything
in Java code than in Python, where the usual duck-typing generally
obviates the need.  Python's flexibility is definitely one of its
fortes, and while duck typing obviates much of the need for explicit
interfaces, nevertheless they're getting popular in large Python
frameworks.  And factory-based design patterns are everywhere in Python,
of course; indeed, it's in Java that you see lots of 'new ThisClass'
constructs which build an instance of some hardwired concrete class --
in Python, instantiation is generally by calling, which makes it
trivially easy to arrange for a function, rather than a class, to be
called, giving the effect of a factory.  Do not underestimate the
flexibility you get from classes and functions being first-class
objects, passed as arguments as easily as any other, ready to call for
instantiation...
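
To illustrate the point about instantiation by calling, here is a small
sketch; every name in it (PlainReport, CachedReport, cached_report,
build_all) is hypothetical.  The caller receives "something callable
that returns a report" and cannot tell a class from a factory function:

```python
class PlainReport:
    def __init__(self, title):
        self.title = title

class CachedReport(PlainReport):
    pass

_cache = {}

def cached_report(title):
    """A plain function used as a factory, called exactly like a class."""
    if title not in _cache:
        _cache[title] = CachedReport(title)
    return _cache[title]

def build_all(make_report, titles):
    # make_report may be a class or a function; both are just callables.
    return [make_report(title) for title in titles]

fresh = build_all(PlainReport, ["a", "a"])     # two distinct instances
shared = build_all(cached_report, ["a", "a"])  # one instance, reused

print(fresh[0] is fresh[1], shared[0] is shared[1])
```

Swapping the class for the function needs no change in build_all, which
is where a hardwired 'new ThisClass' would get in the way.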

In other words, I do not consider your observations to be at all well
founded; the one about exceptions is rather inapplicable to the example
code posted so far.  If, as the OP claimed, Python is slower than Java
for disk-intensive programs, this should be easy for him to show, and I
have not seen it shown yet.


> >> The Python code is reading the entire string into memory and
> >> then writing it.
> >
> > Right, _Python_'s default.
> 
> Arguably Python's default is reading a line at a time, and it

Not for the somefile.read method -- read doesn't do lines.  You may be
thinking of iter(somefile).next instead, and I'm not sure what the Java
equivalent of _that_ one is.
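
The difference can be sketched as follows, assuming Python 3, where the
iterator's method is spelled next(iter(f)) rather than iter(f).next; the
temporary file exists only to make the example self-contained:

```python
import os
import tempfile

# Create a small throwaway file to read back.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first line\nsecond line\n")
    path = tmp.name

with open(path) as f:
    whole = f.read()        # read(): the entire contents as one string

with open(path) as f:
    first = next(iter(f))   # iteration: one line per step

with open(path) as f:
    lines = list(f)         # all lines, still split one per item

os.remove(path)
print(repr(whole), repr(first), lines)
```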

> is a bad default in some circumstances (large files), just
> as the Java code was a bad way of doing anything but small
> files.

Python's default makes it trivially easy to read most files in a single
gulp, so it's appropriate in many cases; Java's makes it hard and slow
to read ANY file, so it's never appropriate.

> > The claim posted to this newsgroup, without any support nor examples
> > being given, was that Python's I/O was far slower than Java's in
> > _disk-intensive_ operations.  I'm still waiting to see any small,
> > verifiable examples of that posted on this thread.
> 
> If the language code is the same, then that claim boils down to
> the Java Native Interface vs the Python C API.  In the case of
> Java, I can see the claim having some relevance in multi-threaded
> code since Java doesn't have the GIL.

Python does have one, but releases it during blocking I/O operations, so
the relevance should be just about the same in both cases.
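
A rough way to observe this, assuming CPython and a modern Python 3: one
thread sits in a blocking recv() while the main thread goes on running
Python code and eventually sends the data that unblocks it.  This is
only a sketch of the behaviour, not a measurement of it:

```python
import socket
import threading

left, right = socket.socketpair()
received = []

def blocking_reader():
    # recv() blocks until data arrives; CPython releases the GIL while
    # the thread waits inside the system call.
    received.append(right.recv(1024))

reader = threading.Thread(target=blocking_reader)
reader.start()

# The main thread keeps executing Python bytecode in the meantime.
total = sum(range(100_000))

# Only now does the data arrive that lets the reader return.
left.sendall(b"done")
reader.join(timeout=5)

left.close()
right.close()
print(total, received)
```

If the reader held the GIL for the whole time it was blocked, the main
thread could never reach the sendall call once the reader had entered
recv.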


> > defaults are tuned, making Python much faster.  Great, then let those
> > who claim Java's I/O is much faster in disk intensive operation post
> > suitable examples, and we'll see.
> 
> Your timing included the overhead of starting up and shutting down
> both environments, making the rest of the measure less than
> interesting.

If the OP intended their claim to apply only to long-running programs
where the difference in environment startup/shutdown gets fully
amortized, they *MIGHT* have deigned to mention the fact.  I have seen
no such claim yet, nor as yet ANY benchmark posted that purports to
prove anything related to the original claim.

I did observe (at some point along the substantial chain of small
benchmarks I and some other posters exchanged) that the 4:1 ratio in
runtime in favour of Python exactly matched the 4:1 ratio in pagefaults,
again in favour of Python, btw.  I guess the startup/shutdown costs can
be amortized by simply looping over the filecopy operation N times.
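
A sketch of such an in-process loop, for the Python side only; the
function names, the 1 MB file size and the repeat count are arbitrary
assumptions, and the copy reads the whole file in one gulp as in the
examples discussed earlier.  Timing inside the process leaves
interpreter startup and shutdown out of the figure altogether:

```python
import os
import tempfile
import time

def copy_whole(src, dst):
    """Copy a file by reading it entirely into memory, then writing it."""
    with open(src, "rb") as fin:
        data = fin.read()
    with open(dst, "wb") as fout:
        fout.write(data)

def time_copies(src, dst, n):
    """Return the average wall-clock seconds per copy over n repetitions."""
    start = time.perf_counter()
    for _ in range(n):
        copy_whole(src, dst)
    return (time.perf_counter() - start) / n

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "source.bin")
dst = os.path.join(workdir, "copy.bin")

payload = os.urandom(1024 * 1024)   # 1 MB of throwaway data
with open(src, "wb") as f:
    f.write(payload)

per_copy = time_copies(src, dst, n=20)
with open(dst, "rb") as f:
    copied = f.read()

os.remove(src)
os.remove(dst)
os.rmdir(workdir)
print("average seconds per copy: %.6f" % per_copy)
```

A file this small will mostly be served from the operating system's
cache, so a run meant to be disk-intensive would need a much larger
file than the one assumed here.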


> The idea isn't to emphasise the open source side, but rather so
> that anyone can see for themselves how it was all put together.
> If I have a business critical app, and claim it is written in
> Python but no one can see the insides then they can't really
> know too much.  The biggest thing is that they can't tell
> if they could write code like that (or even how much was written)
> to produce an app of similar functionality and complexity.

However, firms that choose not to release their business critical
applications as open source are likely to require at the very least a
non-disclosure agreement before they show you those sources, making it
impractical to use those sources to meet your wishes.

Note that even GPL would be no use here: if you do not _distribute_ your
programs, but keep them for in-house use, you keep the option to not
show anybody those programs' sources even when GPL applies.  Therefore,
the problem applies equally to all languages, even a hypothetically
GPL'd one (not that I know of any GPL-covered language in common use for
writing business critical apps).

As for your latter sentence, I've never met a programmer whose default
assumption was that they would NOT be able to write code just as good as
most anybody else's.


Alex
