binary file compare...

Adam Olsen rhamph at gmail.com
Wed Apr 15 15:05:39 EDT 2009


On Apr 15, 12:56 pm, Nigel Rantor <wig... at wiggly.org> wrote:
> Adam Olsen wrote:
> > The chance of *accidentally* producing a collision, although
> > technically possible, is so extraordinarily rare that it's completely
> > overshadowed by the risk of a hardware or software failure producing
> > an incorrect result.
>
> Not when you're using them to compare lots of files.
>
> Trust me. Been there, done that, got the t-shirt.
>
> Using hash functions to tell whether or not files are identical is an
> error waiting to happen.
>
> But please, do so if it makes you feel happy, you'll just eventually get
> an incorrect result and not know it.

Please tell us what hash you used and provide the two files that
collided.

If your hash is 256 bits, then you need around 2**128 files before an
accidental collision becomes likely.  This is the birthday bound, the
same math that a Birthday Attack relies on.  I seriously doubt you had
that many files, which suggests something else went wrong.
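
To put rough numbers on that, here is a minimal sketch using the
standard birthday approximation p ~= 1 - exp(-n*(n-1) / (2*N)), where
N = 2**bits.  It assumes an ideal hash whose outputs are uniformly
distributed; the function name and the example file counts are mine,
purely for illustration.

```python
import math


def collision_probability(num_files, hash_bits):
    """Approximate chance that at least two of num_files distinct files
    share a digest, assuming an ideal, uniformly distributed hash.

    Uses the birthday approximation p ~= 1 - exp(-n*(n-1) / (2*N)).
    """
    space = 2 ** hash_bits  # N, the number of possible digests
    exponent = num_files * (num_files - 1) / (2 * space)
    # expm1 keeps precision when the probability is extremely small,
    # where 1 - exp(-x) would simply round to 0.0.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    # A billion files under a 256-bit hash: vanishingly small.
    print(collision_probability(10**9, 256))
    # At 2**128 files the approximation gives 1 - exp(-0.5), about 0.39.
    print(collision_probability(2**128, 256))
```

So "around 2**128" is where the odds reach the same order as a coin
flip; at any realistic file count the figure is far below the error
rate of the disk or the RAM doing the comparison.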


