How to make Python run as fast (or faster) than Julia

Fri Feb 23 07:41:44 EST 2018

On Fri, Feb 23, 2018 at 11:17 PM, bartc <bc at freeuk.com> wrote:
>>> The fact is that the vast majority of integer calculations don't need to
>>> use big integers (pretty much 100% of mine). Probably most don't even
>>> need 64 bits, but 32 bits.
>>
>>
>> And here we have the World According To Bart again: "since *I* don't need
>> more than 32-bits, then obviously it is a FACT than nobody else could
>> need them".
>
>
> Do you have better statistics than me? If so please share. I guess the
> number of calculations (that /can/ be done in 64 bits) is less than 100%,
> but not 0%, so somewhere in between.
>
> I'm saying it's nearer to 100% than to 0%; you're saying, what ..... ?

I don't know what he's saying, but I'm saying that unless it is 100%,
your language should just natively support bigints. How do you know
when you'll need that extra room? Are you going to pick bigints for
everything that's affected by user input? If so, you may as well use
them for everything and not bother with the distinction; if not, some
day, your program will need larger numbers than you guessed for it,
and it'll break.

> As examples of quantities that can be represented in 64 bits:
>
>  Character codes or code points

Sure. That's why we have a special optimization for sequences of
21-bit numbers. We call it "str".

>  Integer pixel values

Maybe in 64 bits for the time being, but 32 certainly won't be enough.
As soon as you do any sort of high DPI image manipulation, you will
exceed 2**32 total pixels in an image (that's just 65536x65536, or
46341x46341 if you're using signed integers); and it wouldn't surprise
me if some image manipulation needs that many on a single side - if
not today, then tomorrow. So 64 bits might not be enough once you
start counting total pixels.

>  Most enumerations
>  For-loop counting (if it needs more than 64 bits, then we might be
>    waiting a while for it to finish)

Most loops don't need any counting. Should we implement a 0-bit
integer for those?

>  Array indexing
>  Numbers of people, cars, guns, countries, houses, books, iphones etc
>    on Earth
>  ... endless examples of numbers less than 9000000000000000000 used
>    in programming

As soon as you say "the number of X is always going to be less than
Y", you open yourself up to risk.

https://xkcd.com/865/

> Most calculations on these can also be done in 64 bits (except multiply and
> power where the ceiling might 32 bits or lower).

Exactly. Most. And then you'll run into some limitation somewhere.

Let's say you want to know how many IP addresses, on average, are
allocated to one person. You'll probably get a single digit number as
your answer. Can you calculate that using 32-bit integers? What about
64-bit? Should you use bignums just in case? Take your pick, let me
know, and then I'll tell you my answer - and why.

>> Yes Bart, the reason that so many languages have support for Bignums is
>> that everyone except you is an idiot who loves writing slow code for
>> absolutely no reason at all.
>
> If bignum support is a handicap, even when they are not needed, then it is
> something to consider.

What sort of handicap is it? Performance? That's what optimization is
for. Some languages (Pike, for instance) have a single data type that
is sometimes represented internally by a native integer and sometimes
by a bignum; it's completely transparent apart from with timings.
Others (eg Python 2) have two separate data types, but will smoothly
transition to bignum any time it's needed. IMO CPython 3.x could
probably benefit from the transparent optimization... but Python (the
language) benefits hugely from the core int type simply being a
bignum. There are no semantically significant effects at the boundary
between "small numbers" and "large numbers".

Auto-growing lists are a cost, too. It's expensive to have to allocate
more memory to allow another element to be added. Should we lock in
the sizes of all lists/arrays at creation? Why should we pay the price
for that flexibility? Or maybe, in Bartville, this flexibility is
worth paying for but large integers are useless?

ChrisA