BUG? sha-module returns same crc for different files
Erno Kuusela
erno-news at erno.iki.fi
Mon Sep 18 11:43:10 EDT 2000
>>>>> "Treutwein" == Treutwein Guido <Guido.Treutwein at nbg.siemens.de> writes:
Treutwein> c) While the probability that a file has a certain
Treutwein> hash code is 2^(-hash_bitlength), the probability of
Treutwein> finding two files with the same hash value is MUCH
Treutwein> bigger; this is the so-called birthday paradox (due to
Treutwein> the fact that, with 23 persons in a room, the
Treutwein> probability of two having the same birthday is
Treutwein> better than 50%; for a 32-bit CRC the corresponding
Treutwein> limit is about 77000 files for a 50% chance).
2^32 is much smaller than 2^160 (2^128 times smaller, in fact). how many
files would be needed for there to be a 50% chance of a sha-1 hash
collision? (how is it calculated?) 2^160 is
1461501637330902918203684832716283019655932542976...
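
for the "how is it calculated?" part, here is a minimal sketch using the usual birthday-bound approximation n ~ sqrt(2 * N * ln(1/(1-p))), where N is the number of possible hash values and p the desired collision probability. the function name files_for_collision is made up for illustration, and the formula assumes hash values are uniformly distributed, which is an idealization.

```python
import math

def files_for_collision(n_values, p=0.5):
    """Approximate how many random items are needed before two of them
    share a value with probability p, given n_values equally likely values.

    Uses the birthday-bound approximation, not an exact computation.
    """
    return math.sqrt(2.0 * n_values * math.log(1.0 / (1.0 - p)))

# Birthdays (365 days), a 32-bit CRC, and a 160-bit SHA-1 digest.
for label, n_values in (("birthdays", 365), ("crc32", 2 ** 32), ("sha-1", 2 ** 160)):
    print("%-10s %.4g" % (label, files_for_collision(n_values)))
```

under these assumptions the 32-bit case comes out at roughly 77000, matching the figure quoted above, while the 160-bit case is on the order of 10^24 files.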
Treutwein> For this reason, standardization bodies move towards
Treutwein> larger hash sizes like 256 bit.
which standards/hashes?
(not that i disbelieve you; i don't know a lot about cryptography and
i'm curious.)
Treutwein> If you don't have the time to write a hash function as
Treutwein> a C extension package, consider using a combination of
Treutwein> sha-1, md-5, file size and crc32
since there are 4294967296 times more possible values for sha-1
than for md5, methinks this would not make much difference.
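
for reference, the suggested combination could be sketched roughly like this. it assumes a modern Python with the hashlib and zlib modules (not the standalone sha module the thread is about), and fingerprint is a hypothetical helper name, not something from either library.

```python
import hashlib
import zlib

def fingerprint(data):
    """Return (sha1 hex, md5 hex, size, crc32) for a bytes object.

    Illustrative only: combines the four values mentioned in the thread.
    """
    return (
        hashlib.sha1(data).hexdigest(),
        hashlib.md5(data).hexdigest(),
        len(data),
        zlib.crc32(data) & 0xFFFFFFFF,  # mask to an unsigned 32-bit value
    )

# Ratio of possible sha-1 values (160 bits) to md5 values (128 bits).
print(2 ** 160 // 2 ** 128)
print(fingerprint(b"abc"))
```

two inputs compare equal under this scheme only if all four components match.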
-- erno