Python 2.x or 3.x, which is faster?

Chris Angelico rosuav at gmail.com
Mon Mar 7 15:47:09 EST 2016


On Tue, Mar 8, 2016 at 7:19 AM, BartC <bc at freeuk.com> wrote:
>>> I disagree. The program does its job perfectly (you will have to take it
>>> further to verify the results, such as writing out the .ppm file and
>>> viewing
>>> the contents).
>>>
>>> But Py3 is slower doing this task than Py2 by 10 or 20% (or even 30% for
>>> a
>>> smaller file). /This is in line with other observations./
>>
>>
>> What's your meaning of "perfectly"? You're implementing things in a
>> very C way, and then showing that two different Python interpreters
>> have different performance.
>
>
> Two interpreters executing exactly the same code and exactly the same
> algorithm. And for a real task which deliberately doesn't just delegate to
> high-level features (otherwise you're just comparing libraries).
>
> What can be more perfect for comparing two implementations?

def valueOf(digit):
    # C-style digit-to-int conversion, deliberately low-level
    return ord(digit) - ord('0')

class Float:
    # Hand-rolled decimal parser: accumulates the digits as one integer
    # and counts how many of them fall after the decimal point.
    def __init__(self, string):
        self.value = 0
        self.scale = 0
        have_dot = 0
        for i in range(len(string)):
            if string[i] == '.': have_dot = 1
            else:
                self.value = self.value * 10 + valueOf(string[i])
                if have_dot: self.scale += 1

rosuav at sikorsky:~$ python2 -m timeit -s 'from fp import Float'
'Float("1234.567")'
1000000 loops, best of 3: 1.84 usec per loop
rosuav at sikorsky:~$ python3 -m timeit -s 'from fp import Float'
'Float("1234.567")'
100000 loops, best of 3: 2.76 usec per loop

Look! Creating a floating-point value is faster under Python 2 than
Python 3. What could be more perfect?

This is like a microbenchmark in that it tells you nothing about
real-world usage. Your example has the further problem that it does
file I/O, which adds a ton of noise to the numbers. When you
reimplement a C-designed algorithm line-for-line in Python, it's often
NOT going to be the best use of the language; and if you're comparing
two poor uses of a language, what do you prove by showing that one
interpreter runs them faster than the other?
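For contrast, a sketch of the "one obvious way" to parse that same
literal: the built-in float() constructor, which runs in C under both
interpreters, so the bytecode-loop difference being measured above is
mostly out of the picture. (Illustrative only; timings will vary by
machine and build.)

```python
import timeit

# Time the built-in parser rather than the hand-rolled Float class.
t = timeit.timeit('float("1234.567")', number=100000)
print("%.3f usec per call" % (t / 100000 * 1e6))
```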

>> Did you try the 'pillow' library?
>>
>> https://pypi.python.org/pypi/Pillow
>>
>> It supports 2.6/2.7 and 3.2+, and would be most people's "one obvious
>> way" to do things. At very least, it should get a mention in your
>> performance comparisons.
>
>
> I don't understand. This is an external library that appears to be written
> in C. How does that help me compare the performance of two implementations
> of Python?

I'm not sure that it is written in C, but the point is that this is a
much better way to compare code. Use the language well, not badly.
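As a sketch of what "using the language well" can look like, here's a
minimal pure-stdlib P6 .ppm writer that leans on bytes/bytearray
instead of per-character loops. This is illustrative only, not the
decoder from the thread; the function name and signature are my own.

```python
def write_ppm(path, width, height, pixels):
    """Write a binary P6 .ppm file.

    pixels: iterable of (r, g, b) tuples in row-major order,
    each channel an int in 0..255.
    """
    # P6 header: magic, dimensions, maxval, each newline-terminated.
    header = "P6\n{} {}\n255\n".format(width, height).encode("ascii")
    body = bytearray()
    for r, g, b in pixels:
        body += bytes((r, g, b))  # three raw bytes per pixel
    with open(path, "wb") as f:
        f.write(header + bytes(body))
```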

> One could be ten times slower than the other (in executing its native
> bytecode), but if it spends 99% of its time in a C image library, how do you
> measure that?

You write *real world* code and then profile that. You get actual real
programs that you actually really use, and you run those through
timing harnesses. (Or throughput harnesses, which are pretty much the
same thing. With a web application, "performance" doesn't so much mean
"how long it takes to run", but "how many requests per second the
system can handle". Same thing, opposite way of measuring it.)
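A minimal sketch of that kind of measurement, using the standard
library's cProfile on a stand-in workload. The workload function here
is a hypothetical placeholder for a real program's hot path.

```python
import cProfile, pstats, io

def workload():
    # Stand-in for a real program's hot path (hypothetical).
    return sum(i * i for i in range(100000))

pr = cProfile.Profile()
pr.enable()
workload()
pr.disable()

# Report where the time actually went, per function.
s = io.StringIO()
pstats.Stats(pr, stream=s).sort_stats("cumulative").print_stats(5)
report = s.getvalue()
print(report)
```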

>>> (I'm quite pleased with my version: smaller, faster, works on all the
>>> Pythons, supports all 3 colour formats and no decoding bugs that I'm
>>> aware
>>> of, and it's the first Python program I've written that does something
>>> useful.)
>>
>>
>> "all the Pythons" meaning what, exactly?
>
>
> I mean versions 2 and 3 of the language, tested on CPython versions 2.5,
> 2.7, 3.1 and 3.4, as well as PyPy. The other decoders I tried were for 2.x.

CPython 2.5 and 2.7 are very different. Even 2.7.0 and 2.7.11 are
going to differ in performance. And that's without even looking at
what core devs would refer to as "other Pythons", which would include
IronPython, Jython, PyPy (well, you got that, but you're treating it
as an afterthought), MicroPython, Brython, wpython, Nuitka,
Cython..... these are *other Pythons*. What you're looking at is
closer to *all the versions of CPython*, but not even that, since
there are so many different ways that they can be built, and the
performance of array.array() is far from stable. It's not a core
language feature; you're using it because it's the most obvious
translation of your C algorithm, not because it's the best way to use
Python. So I say again, your measurement has little to no significance
to real-world code.
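To illustrate that last point: array.array lives in a library module,
while bytearray is a builtin, and for raw byte pixel buffers the two
are interchangeable in the small. A hedged sketch (the variable names
are mine):

```python
from array import array

# array.array requires an import and a typecode; bytearray does not.
pixels_arr = array("B", [0] * 12)  # 12 unsigned bytes via the array module
pixels_ba = bytearray(12)          # the same storage with a builtin

pixels_arr[0] = 255
pixels_ba[0] = 255
```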

ChrisA


