flaming vs accuracy [was Re: Performance of int/long in Python 3]

Thu Mar 28 13:13:24 EDT 2013

On Fri, Mar 29, 2013 at 3:55 AM, jmfauth <wxjmfauth at gmail.com> wrote:
> Assume you have a set of integers {0...9} and an operator,
> let say, the addition.
>
> Idea.
> Just devide this set in two chunks, {0...4} and {5...9}
> and work hardly to optimize the addition of 2 operands in
> the sets {0...4}.
>
> The problems.
> - When optimizing "{0...4}", your algorithm will most probably
> weaken "{5...9}".
> - When using "{5...9}", you do not benefit from your algorithm, you
> will be penalized just by the fact you has optimized "{0...4}"
> - And the first mistake, you are just penalized and impacted by the
> fact you have to select in which subset you operands are when
> working with "{0...9}".
>
> Very interestingly, working with the representation (bytes) of
> these integers will not help. You have to consider conceptually
> {0..9} as numbers.

Yeah, and there's an easy representation of those numbers. But let's
look at Python's representations of integers. I have a sneaking
suspicion something takes note of how large the number is before
deciding how to represent it. Look!

>>> sys.getsizeof(1)
14
>>> sys.getsizeof(1<<2)
14
>>> sys.getsizeof(1<<4)
14
>>> sys.getsizeof(1<<8)
14
>>> sys.getsizeof(1<<31)
18
>>> sys.getsizeof(1<<30)
18
>>> sys.getsizeof(1<<16)
16
>>> sys.getsizeof(1<<12345)
1660
>>> sys.getsizeof(1<<123456)
16474

Small numbers are represented more compactly than large ones! And it's
not like in REXX, where all numbers are stored as strings.

Go fork CPython and make the changes you suggest. Then run real-world
code on it and see how it performs. Or at very least, run plausible
benchmarks like the strings benchmark from the standard tests.

My original post about integers was based on two comparisons: Python 2
and Pike. Both languages have an optimization for "small" integers
(where "small" is "within machine word" - on rechecking some of my
stats, I find that I perhaps should have used a larger offset, as the
64-bit Linux Python I used appeared to be a lot faster than it should
have been), which Python 3 doesn't have. Real examples, real
statistics, real discussion. (I didn't include Pike stats in what I
posted, for two reasons: firstly, it would require a reworking of the
code, rather than simply "run this under both interpreters"; and
secondly, Pike performance is completely different from CPython
performance, and is non-comparable. Pike is more similar to PyPy, able
to compile - in certain circumstances - to machine code. So the
comparisons were Py2 vs Py3.)

ChrisA