random.SystemRandom().randint() inefficient

Tue Jul 26 17:47:59 EDT 2022

Chris Angelico <rosuav at gmail.com> writes:

> On Wed, 27 Jul 2022 at 06:06, Cecil Westerhof via Python-list
> <python-list at python.org> wrote:
>>
>> Chris Angelico <rosuav at gmail.com> writes:
>>
>> > On Wed, 27 Jul 2022 at 01:06, Cecil Westerhof via Python-list
>> > <python-list at python.org> wrote:
>> >>
>> >> I need to get a random integer. At first I tried it with:
>> >>     from secrets import randbelow
>> >>     index = randbelow(len(to_try))
>> >>
>> >> This works perfectly, but it took some time. So I thought I would try:
>> >>     from random import SystemRandom
>> >>     index = SystemRandom().randint(0, len(to_try) - 1)
>> >>
>> >> A first indication is that the second version would take about two
>> >> times as much time as the first. Is there a reason for this, or should
>> >> this not be happening?
>> >>
>> >
>> > You're setting up a brand new SystemRandom instance just for a single
>> > random number. For a fairer comparison, set up the instance, then
>> > generate far more than just a single number, and see how that goes.
>>
>> Thanks. I thought I did something wrong and I did.
>> I will try to implement like you said and look what the result will
>> be. (And share it.)
>
> Thanks! Don't feel bad; performance testing is *hard*, getting
> meaningful results takes a lot of fiddling with parameters, and
> getting interesting AND meaningful results can sometimes seem about
> impossible.
>
>> (As I understand it, both do more or less the same and should have
>> comparable performance.)
>
> In normal production work? Yes (the SystemRandom object doesn't have
> any significant state - a seeded RNG could have a lot more overhead
> here). But for performance testing? The work of instantiating the
> class could be completely irrelevant, or it could be dominating your
> results. It's hard to say, hence the suggestion to try it without
> reinstantiating.

It had a very big influence. Originally my program took about three
times as long to run. (The program was still running when I posted
the original message, and the difference was larger than I anticipated.)
Removing the repeated instantiation cut about 45% of the execution
time of the program. (So the instantiation is quite expensive.)
But it still takes about 50% more time than with randbelow. So I am
still a bit flabbergasted.

The new code:
    from random import SystemRandom
    system_random = SystemRandom()
    index = system_random.randint(0, len(to_try) - 1)

The first two statements are executed once.
The last statement is executed, I think, about 75 * 10 ** 6 times.
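
To compare the three variants in isolation, something like the
following could be used. It is only a minimal sketch: the 1000-element
to_try, the call count and the function names are assumptions made up
for illustration, and the timings will differ per machine and Python
version.

```python
import timeit
from random import SystemRandom
from secrets import randbelow

# Hypothetical stand-in for the real list; only its length matters here.
to_try = list(range(1000))
n = len(to_try)

# Created once and reused, as in the new code above.
system_random = SystemRandom()


def fresh_instance():
    # First attempt: a new SystemRandom object for every number.
    return SystemRandom().randint(0, n - 1)


def reused_instance():
    # New code: one shared SystemRandom object.
    return system_random.randint(0, n - 1)


def secrets_randbelow():
    # Original version using the secrets module.
    return randbelow(n)


CALLS = 20_000  # assumption: far fewer calls than the real program makes

timings = {}
for func in (fresh_instance, reused_instance, secrets_randbelow):
    # Take the best of three runs to reduce noise from other processes.
    timings[func.__name__] = min(timeit.repeat(func, number=CALLS, repeat=3))

for name, seconds in timings.items():
    print(f"{name:18s} {seconds:.4f} s for {CALLS} calls")
```

Timing the calls separately from the rest of the program should show
how much of the difference comes from the random number generation
itself.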

So it seems that my first idea of using randbelow was the correct one.
But if anyone could explain why SystemRandom is so much more
expensive, I would be interested to know.
(Or am I still doing something wrong?)

-- 
Cecil Westerhof
Senior Software Engineer
LinkedIn: http://www.linkedin.com/in/cecilwesterhof