[Python-Dev] Hash collision security issue (now public)

Eric Snow ericsnowcurrently at gmail.com
Wed Jan 4 20:15:46 CET 2012


On Wed, Jan 4, 2012 at 12:59 AM, Maciej Fijalkowski <fijall at gmail.com> wrote:
> On Wed, Jan 4, 2012 at 12:02 AM, Bill Janssen <janssen at parc.com> wrote:
>> Christian Heimes <lists at cheimes.de> wrote:
>>
>>> Am 29.12.2011 12:13, schrieb Mark Shannon:
>>> > The attack relies on being able to predict the hash value for a given
>>> > string. Randomising the string hash function is quite straightforward.
>>> > There is no need to change the dictionary code.
>>> >
>>> > A possible (*untested*) patch is attached. I'll leave it for those more
>>> > familiar with unicodeobject.c to do properly.
>>>
>>> I'm worried that hash randomization of str is going to break 3rd party
>>> software that rely on a stable hash across multiple Python instances.
>>> Persistence layers like ZODB and cross interpreter communication
>>> channels used by multiprocessing may (!) rely on the fact that the hash
>>> of a string is fixed.
>>
>> Software that depends on an undefined hash function for synchronization
>> and persistence deserves to break, IMO.  There are plenty of
>> well-defined hash functions available for this purpose.
>>
>> Bill
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at python.org
>> http://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe: http://mail.python.org/mailman/options/python-dev/fijall%40gmail.com
>
> A lot of software will break their tests, because dict ordering would
> depend on the particular run. I know, because some of them break on
> pypy which has a different dict ordering. This is probably a good
> thing in general, but is it really worth it? People will install
> python 2.6.newest and stuff *will* break.

So if we're making the new hashing the default and giving an option to
use the old, we should make it _really_ clear in the release
notes/announcement about how to revert the behavior.

-eric

>
> Is it *really* a security issue? We knew all along that dicts are
> O(n^2) in worst case scenario, how is this suddenly a security
> problem?
>
> Cheers,
> fijal
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/ericsnowcurrently%40gmail.com


More information about the Python-Dev mailing list