Randomizing Strings In A Microservices World

Chris Angelico rosuav at gmail.com
Tue Dec 10 13:37:21 EST 2019


On Wed, Dec 11, 2019 at 5:01 AM Tim Daneliuk <info at tundraware.com> wrote:
>
> On 12/10/19 10:36 AM, Peter Pearson wrote:
> > Just to be sure: you *are* aware that the "Birthday Paradox" says
> > that if you pick your 10-digit strings truly randomly, you'll probably
> > get a collision by the time of your 10**5th string . . . right?
>
> I did not consider this, but the point is taken.
>
> Could you kindly point me to a source for calculating this given
> n-digit numeric-only strings?
>

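The exact formula is pretty gnarly, but you can get remarkably close
by assuming that a collision becomes likely once you've generated
about the square root of the number of possible values, which halves
the exponent. So a 16-bit checksum will probably collide after about
2**8 examples, and n-digit numeric strings after about 10**(n/2)
strings (10**5 for your 10 digits). A UUID4 has only 122 random bits
out of its 128, so it takes about 2**61 of them; a fully random
128-bit value would take about 2**64.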
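
If you want actual numbers rather than the rule of thumb, here's a
rough sketch using the standard approximation
p ~= 1 - exp(-k*(k-1)/(2*N)) for k uniformly random picks out of N
possible values. The function names are just ones I made up for
illustration, and it assumes your strings really are uniformly random:

```python
import math


def collision_probability(k, space):
    """Approximate chance that k uniformly random picks out of `space`
    equally likely values include at least one duplicate."""
    # -expm1(-x) is 1 - exp(-x), computed accurately for tiny x.
    return -math.expm1(-k * (k - 1) / (2 * space))


def draws_for_probability(p, space):
    """Approximate number of picks needed before the chance of a
    collision reaches p (with 0 < p < 1)."""
    return math.ceil(math.sqrt(2 * space * math.log(1 / (1 - p))))


space = 10 ** 10  # 10-digit numeric-only strings
print(collision_probability(10 ** 5, space))  # a bit under 0.4
print(draws_for_probability(0.5, space))      # a little over 10**5
```

So for 10-digit strings the odds of a collision are already around
40% at 10**5 strings, and pass even money shortly after that.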

https://en.wikipedia.org/wiki/Birthday_problem

ChrisA

