binary file compare...

SpreadTooThin bjobrien62 at gmail.com
Fri Apr 17 11:59:32 EDT 2009


On Apr 17, 4:54 am, Nigel Rantor <wig... at wiggly.org> wrote:
> Adam Olsen wrote:
> > On Apr 16, 11:15 am, SpreadTooThin <bjobrie... at gmail.com> wrote:
> >> And yes, he is right: CRCs and hashes all have a probability of saying that
> >> the files are identical when in fact they are not.
>
> > Here's the bottom line.  It is either:
>
> > A) Several hundred years of mathematics and cryptography are wrong.
> > The birthday problem as described is incorrect, so a collision is far
> > more likely than 42 trillion trillion to 1.  You are simply the first
> > person to have noticed it.
>
> > B) Your software was buggy, or possibly the input was maliciously
> > produced.  Or, a really tiny chance that your particular files
> > contained a pattern that provoked bad behaviour from MD5.
>
> > Finding a specific limitation of the algorithm is one thing.  Claiming
> > that the math is fundamentally wrong is quite another.
>
> You are confusing yourself about probabilities young man.
>
> Just because something is extremely unlikely does not mean it can't
> happen on the first attempt.
>
> This is true *no matter how big the numbers are*.
>
> If you persist in making these ridiculous claims that people *cannot*
> have found collisions then as I said, that's up to you, but I'm not
> going to employ you to do anything except make tea.
>
> Thanks,
>
>    Nigel

You know, this is just insane.  I'd be satisfied with a CRC16 or
something similar in the situation I'm in.
I have two large files, one local and one remote.  Transferring every
byte across the internet just to be sure that the two files are
identical is not feasible.  If the two servers, one on each side, each
calculate a CRC and transmit it for comparison, I'm happy.
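
For what it's worth, here is a minimal sketch of what each side could run,
assuming both ends have Python with the standard zlib and hashlib modules.
The names file_crc32, file_digest and CHUNK_SIZE are made up for
illustration; how the resulting value gets sent to the other side is left
out entirely.

```python
import hashlib
import zlib

# Hypothetical chunk size: read 1 MiB at a time so a large file is never
# held in memory all at once.
CHUNK_SIZE = 1 << 20


def file_crc32(path):
    """Return the CRC-32 of the file at `path` as an unsigned 32-bit int."""
    crc = 0
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(CHUNK_SIZE), b""):
            # Passing the running value continues the checksum across chunks.
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def file_digest(path, algorithm="sha256"):
    """Return a hex digest of the file at `path` using a hashlib algorithm."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK_SIZE), b""):
            h.update(chunk)
    return h.hexdigest()
```

Each server would call file_crc32() (or file_digest(), if a longer value
with a much lower chance of an accidental match is wanted) on its own copy
and send only the result.  Matching values do not prove the files are
identical, but differing values do prove they are not.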


