Non-deterministic set ordering

MRAB python at mrabarnett.plus.com
Sun May 15 23:40:17 EDT 2022


On 2022-05-16 04:20, Rob Cliffe via Python-list wrote:
> 
> 
> On 16/05/2022 04:13, Dan Stromberg wrote:
>>
>> On Sun, May 15, 2022 at 8:01 PM Rob Cliffe via Python-list 
>> <python-list at python.org> wrote:
>>
>>     I was shocked to discover that when repeatedly running the following
>>     program (condensed from a "real" program) under Python 3.8.3
>>
>>     for p in { ('x','y'), ('y','x') }:
>>          print(p)
>>
>>     the output was sometimes
>>
>>     ('y', 'x')
>>     ('x', 'y')
>>
>>     and sometimes
>>
>>     ('x', 'y')
>>     ('y', 'x')
>>
>>     Can anyone explain why running identical code should result in
>>     traversing a set in a different order?
>>
>>
>> Sets are defined as unordered so that they can be hashed internally to 
>> give O(1) operations for many tasks.
>>
>> It wouldn't be unreasonable for sets to use a fixed-by-arbitrary 
>> ordering for a given group of set operations, but being unpredictable 
>> deters developers from mistakenly assuming they are ordered.
>>
>> If you need order, you should use a tuple, list, or something like 
>> https://grantjenks.com/docs/sortedcontainers/sortedset.html
> Thanks, I can work round this behaviour.
> But I'm curious: where does the variability come from?  Is it deliberate
> (as your answer seems to imply)?  AFAIK the same code within the *same
> run* of a program does produce identical results.

Basically, Python uses hash randomisation in order to protect it against 
denial-of-service attacks. (Search for "PYTHONHASHSEED" in the docs.)

It also applied to dicts (the code for sets was based on that for 
dicts), but dicts now remember their insertion order.


More information about the Python-list mailing list