Shorter checksum than MD5

Paul Rubin http
Fri Sep 10 03:48:17 EDT 2004


danb_83 at yahoo.com (Dan Bishop) writes:
> > Where are the updates coming from?  Note that if you use a 32-bit
> > checksum, with 100000 records you will probably have some records with
> > the same checksum by accident.
> 
> Only if you use a checksum algorithm with really bad clustering problems.
> 
> If all 2**32 checksums are equally likely, the probability of a
> collision is only about 0.0000232828.

That's incorrect; the probability is much higher, more like 0.7.

If you have 30 people in a room, do you know how to find the
probability that some two have the same birthday?
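
The same calculation covers both cases. Here is a rough sketch, assuming
every value (birthday or checksum) is equally likely and independent; the
function name collision_probability is made up for illustration:

```python
import math

def collision_probability(n, buckets):
    """Chance that at least two of n uniformly random values, each drawn
    from `buckets` equally likely possibilities, come out the same."""
    if n > buckets:
        return 1.0  # pigeonhole: more items than possible values
    # P(all distinct) = product over i in 0..n-1 of (1 - i/buckets).
    # Summing logs avoids underflow and rounding trouble for large n.
    log_all_distinct = sum(math.log1p(-i / buckets) for i in range(n))
    # 1 - exp(x), computed accurately even when x is close to zero.
    return -math.expm1(log_all_distinct)

print(collision_probability(30, 365))        # 30 people, 365 birthdays: roughly 0.71
print(collision_probability(100000, 2**32))  # 100000 records, 32-bit checksum: roughly 0.69
```

The 0.0000232828 figure is 100000/2**32, which is roughly the chance that
one particular new value matches some value in an existing set of 100000,
not the chance that any two of the 100000 match each other.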



More information about the Python-list mailing list