Python 2.x or 3.x, which is faster?

Steven D'Aprano steve at pearwood.info
Mon Mar 7 21:12:34 EST 2016


On Tue, 8 Mar 2016 09:39 am, BartC wrote:

> On 07/03/2016 20:47, Chris Angelico wrote:
>> On Tue, Mar 8, 2016 at 7:19 AM, BartC <bc at freeuk.com> wrote:
> 
>>> What can be more perfect for comparing two implementations?
> 
>> rosuav at sikorsky:~$ python2 -m timeit -s 'from fp import Float'
>> 'Float("1234.567")'
>> 1000000 loops, best of 3: 1.84 usec per loop
>> rosuav at sikorsky:~$ python3 -m timeit -s 'from fp import Float'
>> 'Float("1234.567")'
>> 100000 loops, best of 3: 2.76 usec per loop
>>
>> Look! Creating a floating-point value is faster under Python 2 than
>> Python 3. What could be more perfect?
> 
>> This is like a microbenchmark in that it doesn't tell you anything
>> about real-world usage.
> 
> Microbenchmarks have their uses, when you are concentrating on a
> particular aspect of a language. But I tried your code, calling Float a
> million times, and 2.7 took 8.3 seconds; 3.4 took 10.5 seconds. But that
> is meaningless according to you.

I think Chris is looking at this as not being a fair test of Python's speed:
nobody in their right mind would use his pure-Python Float when built-in
floats exist.

But I think Chris is wrong. He may remember that Python gained a Decimal
class written in pure Python. Does he think that it is invalid to ask how
fast Decimal is? Surely not. Creating millions of Decimal instances might
not be the single biggest bottleneck slowing your code down, but I'm pretty
sure we would want that to be as fast as possible.

(In fact, in Python 3.3, Decimal was re-written in C for speed.)
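If anyone wants to see how Decimal construction stacks up against the Float
class from this thread, here is a rough sketch using the timeit module
directly. The statement and the number of calls are arbitrary choices, not a
rigorous benchmark:

# A rough sketch, not a rigorous benchmark: time constructing a Decimal
# from a string, analogous to the Float("1234.567") micro-benchmark above.
from timeit import timeit

total = timeit('Decimal("1234.567")',
               setup='from decimal import Decimal',  # run once, not timed
               number=1000000)                       # one million constructions
# With number=1000000, the total in seconds is numerically equal to
# microseconds per call.
print("%.2f usec per call" % total)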

Anyway, just for comparison's sake, I ran the same micro-benchmark:

[steve at ando ~]$ python2.6 -m timeit -s 'from fp import
Float' 'Float("1234.567")'
100000 loops, best of 3: 11.1 usec per loop

[steve at ando ~]$ python2.7 -m timeit -s 'from fp import
Float' 'Float("1234.567")'
100000 loops, best of 3: 12.1 usec per loop

[steve at ando ~]$ python3.3 -m timeit -s 'from fp import
Float' 'Float("1234.567")'
100000 loops, best of 3: 13.6 usec per loop

[steve at ando ~]$ python/python-dev/3.5/python -m timeit -s 'from fp import
Float' 'Float("1234.567")'
10000 loops, best of 3: 54 usec per loop


So you can see that on my computer there is an apparent slowdown between 2.6
and 2.7, and again between 2.7 and 3.3. I say "apparent" because it may not
be statistically meaningful: timing results are notoriously fickle. I ran 2.7
again three times, and got three faster results:

100000 loops, best of 3: 11.8 usec per loop
100000 loops, best of 3: 11.7 usec per loop
100000 loops, best of 3: 11.6 usec per loop


I'm pretty sure that the "11.8 .7 .6" pattern is just a coincidence, because
I ran it again:

100000 loops, best of 3: 12.4 usec per loop

So there is considerable variation in timing results.
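(As an aside, and purely as a sketch of one way to tame that jitter: the
timeit module's repeat() function runs the measurement several times so you
can take the minimum yourself, which is essentially what the command line's
"best of 3" already does. The repeat and number values below are arbitrary,
and this assumes the fp module from Chris's test is importable.)

# Sketch: repeat the measurement and report the best (minimum) run,
# which is the least noisy figure for a micro-benchmark like this.
from timeit import repeat

runs = repeat('Float("1234.567")',
              setup='from fp import Float',  # fp module from the thread
              repeat=10,                     # ten independent runs
              number=100000)                 # 100,000 calls per run
print("best of 10: %.2f usec per loop" % (min(runs) / 100000 * 1e6))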

What about 3.5? That's four times slower than 3.3? What happened there?

Simple: I compiled 3.5 with all the debugging code turned on.



> (3.1 took 8.5 seconds. What happened between 3.1 and 3.4 to cause such a
> slow-down? Do you think it might be a good idea for /someone/ at least
> to pay some attention to that, before it grinds to a halt
> completely by version 4.0?)

This is the whole point of the speed.python.org site, to monitor and
benchmark the speed of the language.


[...]
>> CPython 2.5 and 2.7 are very different. Even 2.7.0 and 2.7.11 are
>> going to differ in performance. And that's without even looking at
>> what core devs would refer to as "other Pythons", which would include
>> IronPython, Jython, PyPy (well, you got that, but you're treating it
>> as an afterthought), MicroPython, Brython, wpython, Nuitka,
>> Cython..... these are *other Pythons*.
> 
> What are you suggesting here? That all these Pythons are going to be
> faster or slower than each other? I would guess that most of them are
> going to be roughly the same, other than PyPy. If there was a fast
> version, then I would have heard about it!

I think Chris exaggerates the performance differences between bug fix
releases, at least in general. There's unlikely to be a major change in the
interpreter applied to (say) 2.7.11 that would speed it up drastically
compared to 2.7.0. But there may be a series of small performance
enhancements which, *together*, add up to a moderate increase in overall
speed (while being all but invisible to sufficiently small
micro-benchmarks).

But typically, it is very, very hard to definitively say that one version or
implementation of Python is faster than another. That will often depend on
what you're trying to do! But, with lots of hand-waving and a certain
amount of trepidation, I'd like to offer this as the "conventional wisdom"
for the speed of pure-Python code on a semi-arbitrary set of benchmarks which
may or may not reflect actual use by anyone:

# slowest
jython
cpython + stackless
ironpython
pypy
# fastest



>> the
>> performance of array.array() is far from stable. It's not a core
>> language feature; you're using it because it's the most obvious
>> translation of your C algorithm,
> 
> I'm using it because this kind of file reading in Python is a mess. If I
> do a read, will I get a string, a byte sequence object, a byte-array, or
> array-array, or what?

Calling it "a mess" is an exaggeration. There is a change between Python 2
and 3:

- in Python 2, reading from a file gives you bytes, that is, the
so-called "str" type, not unicode;

- in Python 3, reading from a file in binary mode gives you bytes, that is,
the "bytes" type; reading in text mode gives you a string, the "str" type.

How is this a mess?
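For the avoidance of doubt, here is a tiny sketch of the two Python 3 cases
described above (the file name is just a placeholder):

# Python 3: binary mode gives bytes, text mode gives str.
with open("sample.txt", "rb") as f:
    print(type(f.read()))      # <class 'bytes'>

with open("sample.txt", "r", encoding="utf-8") as f:
    print(type(f.read()))      # <class 'str'>

# Python 2: open("sample.txt").read() returns the 2.x str type, which
# holds bytes; you only get unicode if you decode it yourself (or use
# codecs.open / io.open with an encoding).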




-- 
Steven



