md5 and large files

Brad Tilley bradtilley at gmail.com
Mon Oct 18 16:34:54 EDT 2004


Josiah Carlson wrote:
>>Of course I can't... *grin* -- But actually, (correct me if I'm wrong) an 
>>MD5 sum is 128 bits long, which gives 2^128 different possibilities. Now a 2 
>>GB file has 8*2*1024^3 bits, which gives 2^17179869184 different 
>>possibilities for a 2 GB file. Am I right in thinking that the average 
>>number of files with an identical md5 sum is then 2^17179869184 / 2^128 = 
>>2^17179869056?
> 
> 
> (I know this is a major simplification, but bear with me)
> 
> Your probability and numbers are correct.  The problem is that 2^128 is
> so damn huge

All you need is 2^128+1 distinct inputs to guarantee a duplicate, no? With 
only 2^128 possible sums, at least two of them must share one. The problem, 
as I understand it, is getting that far (2^128+1), as sufficient computing 
power isn't available... yet.
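
To make that concrete, here is a rough sketch, not anything from the thread itself. It checks the exponent arithmetic quoted above, then runs the "one more input than there are sums" argument on a toy hash: MD5 cut down to its first 2 bytes, so there are only 2^16 possible values. The helper names truncated_md5 and find_collision are made up for illustration, and the 2-byte truncation is an assumption chosen so the search finishes instantly; nothing here says anything about the cost of colliding full MD5.

```python
import hashlib
from itertools import count

# Arithmetic from the quoted message: bits in a 2 GB file, and the
# exponent of the average number of such files per MD5 value.
bits_in_2gb = 8 * 2 * 1024**3
avg_exponent = bits_in_2gb - 128


def truncated_md5(data, nbytes=2):
    """Hypothetical toy hash: the first nbytes of the MD5 digest."""
    return hashlib.md5(data).digest()[:nbytes]


def find_collision(nbytes=2):
    """Hash distinct inputs until two of them share a truncated digest.

    With 2**(8*nbytes) possible digests, a repeat is guaranteed after at
    most 2**(8*nbytes) + 1 distinct inputs.
    """
    seen = {}
    for i in count():
        msg = str(i).encode()  # distinct input for every i
        digest = truncated_md5(msg, nbytes)
        if digest in seen:
            return seen[digest], msg, i + 1
        seen[digest] = msg


first, second, tried = find_collision()
print("bits in a 2 GB file:", bits_in_2gb)
print("exponent after dividing by 2^128:", avg_exponent)
print("colliding inputs:", first, second, "after", tried, "tries")
```

The number of tries can never exceed 2^16+1 for the 2-byte toy hash, and in practice a repeat usually shows up well before that worst case. For the full 128-bit sum the same guarantee holds at 2^128+1, which is exactly the part nobody has the computing power for.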


