sorting with expensive compares?

Steven D'Aprano steve at REMOVETHIScyber.com.au
Sat Dec 24 00:29:07 EST 2005


On Fri, 23 Dec 2005 21:07:57 -0800, Paul Rubin wrote:

> Steven D'Aprano <steve at REMOVETHIScyber.com.au> writes:
>> http://www.iusmentis.com/technology/encryption/pgp/pgpattackfaq/hash/
>> 
>> the expected number of random unique files you would need to compare
>> before finding a single collision in the MD5 hashes is (very roughly)
>> 10**70, or ten billion trillion trillion trillion trillion trillion.
> 
> That's not right, the right number is around 2**64 or 2e19.  See the
> section of that page about the birthday attack.  It's still an awful lot.

Oh poot, a stupid typo! I entered 1e60 in my calculation instead of 1e6,
which is doubly embarrassing because the correct number to use is 1e9:

[quote]
In MD5's case, 2**64 messages need to be tried. This is not a feasible
attack given today's technology. If you could try 1,000,000 messages per
second, it would take 584,942 years to find a collision. A machine that
could try 1,000,000,000 messages per second would take 585 years, on
average.
[end quote]

So the expected number of files (messages) needed to check to find a
collision is:

585*365*24*60*60*1e9 = 1.8e19

or more than ten million million million, which of course equals 2**64.




> There are also known ways of deliberately constructing md5 collisions
> (i.e. md5 is broken).  Whether the OP should care about that depends
> on the application.

Sure, but I don't he is deliberately trying to sabotage his own files :-)


-- 
Steven.




More information about the Python-list mailing list