random.SystemRandom().randint() inefficient

Tue Jul 26 18:44:01 EDT 2022

On Wed, 27 Jul 2022 at 08:18, Cecil Westerhof via Python-list
<python-list at python.org> wrote:
>
> Chris Angelico <rosuav at gmail.com> writes:
>
> > On Wed, 27 Jul 2022 at 06:06, Cecil Westerhof via Python-list
> > <python-list at python.org> wrote:
> >>
> >> Chris Angelico <rosuav at gmail.com> writes:
> >>
> >> > On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list
> >> > <python-list at python.org> wrote:
> >> >>
> >> >> I need to get a random integer. At first I tried it with:
> >> >>     from secrets import randbelow
> >> >>     index = randbelow(len(to_try))
> >> >>
> >> >> This works perfectly, but it took some time. So I thought I try:
> >> >>     from random  import SystemRandom
> >> >>     index = SystemRandom().randint(0, len(to_try) - 1)
> >> >>
> >> >> A first indication is that the second version would take about two
> >> >> times as much time as the first. Is there a reason for this, or should
> >> >> this not be happening?
> >> >>
> >> >
> >> > You're setting up a brand new SystemRandom instance just for a single
> >> > random number. For a fairer comparison, set up the instance, then
> >> > generate far more than just a single number, and see how that goes.
> >>
> >> Thanks. I thought I did something wrong and I did.
> >> I will try to implement like you said and look what the result will
> >> be. (And share it.)
> >
> > Thanks! Don't feel bad; performance testing is *hard*, getting
> > meaningful results takes a lot of of fiddling with parameters, and
> > getting interesting AND meaningful results can sometimes seem about
> > impossible.
> >
> >> (As I understand it both do more, or less the same and should have
> >> comparable performance.)
> >
> > In normal production work? Yes (the SystemRandom object doesn't have
> > any significant state - a seeded RNG could have a lot more overhead
> > here). But for performance testing? The work of instantiating the
> > class could be completely irrelevant, or it could be dominating your
> > results. It's hard to say, hence the suggestion to try it without
> > reinstantiating.
>
> It had a very big influence. Original it took about three times more
> time to run my program. (The program was still running when I posted
> the original post and the difference was higher as I anticipated.)
> Removing that did cut about 45% of the execution time of the program.
> (So the initiation is quit expensive.)
> But it still takes about 50% more time. So I am still a bit
> flabbergasted.
>
> The new code:
>     from random  import SystemRandom
>     system_random   = SystemRandom()
>     index = system_random.randint(0, len(to_try) - 1)
>
> The first two statements are executed once.
> The last statement I think about 75 * 10 ** 6.
>
> So it seems that my first idea of using randbelow was the correct one.
> But if anyone could explain why SystemRandom is so much more
> expensive, I would be interested to know it.
> (Or am I still doing something wrong?)

Hmm. There are still a lot of differences here. Are you able to make
use of randrange() instead, to make them more consistent?

According to the source code, secrets.randbelow is calling on an
internal method _randbelow of the SystemRandom object, but randrange
(if called with only one arg) will go straight into that same method.
Here's my results:

rosuav at sikorsky:~$ python3 -m timeit -s 'from random import randrange'
'randrange(10000)'
1000000 loops, best of 5: 322 nsec per loop
rosuav at sikorsky:~$ python3 -m timeit -s 'from random import
SystemRandom; r = SystemRandom()' 'r.randint(0, 10000)'
200000 loops, best of 5: 1.92 usec per loop
rosuav at sikorsky:~$ python3 -m timeit -s 'from random import
SystemRandom; r = SystemRandom()' 'r.randrange(10000)'
200000 loops, best of 5: 1.87 usec per loop
rosuav at sikorsky:~$ python3 -m timeit -s 'from secrets import
randbelow' 'randbelow(10000)'
200000 loops, best of 5: 1.64 usec per loop

(The difference with the first one is that it isn't using the system
RNG, so it has the limitations of an internal PRNG.)

When you call randint, what happens is (1) the endpoint is incremented
to transform it from inclusive-inclusive to inclusive-exclusive; (2)
randrange is called with two args; (3) fast path 1 is skipped, fast
path 2 is taken, and _randbelow gets called to get an actual random
number, which gets zero added to it before returning.

If, instead, you use randrange(len(to_try)), what would happen is (1)
fast path 1 is used, and (2) _randbelow is called to get the random
number.

In secrets.randbelow, it's even better: (1) _randbelow is called to
get the random number.

I think that might be what's going on here. You're adding some tiny
amounts of extra work, but if you were to make them more similar, the
distinctions would evaporate. Here's a couple of other variants that
use SystemRandom:

rosuav at sikorsky:~$ python3 -m timeit -s 'from random import
SystemRandom; randrange = SystemRandom().randrange' 'randrange(10000)'
200000 loops, best of 5: 1.81 usec per loop
rosuav at sikorsky:~$ python3 -m timeit -s 'from random import
SystemRandom; randrange = SystemRandom()._randbelow'
'randrange(10000)'
200000 loops, best of 5: 1.61 usec per loop

Note that we shave a few usec by avoiding the attribute lookup, but
even more by directly calling _randbelow, which is why
secrets.randbelow is able to save a bit. But don't use internal
methods; you can probably get pretty much equivalent performance from
randrange, which is public.

Hope that's of some value!

ChrisA