[Tutor] PYTHONHASHSEED, -R

Mon Jul 29 23:27:24 CEST 2013

----- Original Message -----

> From: eryksun <eryksun at gmail.com>
> To: Albert-Jan Roskam <fomcl at yahoo.com>
> Cc: Python Mailing List <tutor at python.org>
> Sent: Monday, July 29, 2013 7:44 PM
> Subject: Re: [Tutor] PYTHONHASHSEED, -R
> 
>>  question is almost new-thread-worthy, but: if I would like to make
>>  my app work for 2.x and 3.x, what is the best approach:
>>  (a) use "if sys.version_info.major...." throughout the code
>>  (b) use 2to3, hope for the best and fix manually whatever can't be
>>  fixed
> 
> (c) Use "six".

Ok, thanks, I will check this out. By the way, is Pythonbrew still the preferred way of working with different Python versions? I read about it and it seems handy for switching between versions, but I also read that it is no longer maintained.

> BTW, sys.version_info.major doesn't work in <2.7. The names were added
> in 2.7/3.1.
> 
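A minimal sketch of a version check that avoids the named attribute, assuming the code has to run on interpreters older than 2.7 as well. The names PY3 and text_type are illustrative only and do not come from this thread:

```python
import sys

# Indexing sys.version_info works on every 2.x and 3.x release;
# the named field ".major" only exists from 2.7/3.1 onwards.
PY3 = sys.version_info[0] >= 3

if PY3:
    text_type = str
else:
    text_type = unicode  # only evaluated on Python 2

print(text_type.__name__)
```

A helper library such as six wraps this kind of check behind a single import, which is the point of option (c).
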
>>>  The dict isn't changing state,
>> 
>>  So that's the criterion! Thanks! So as long as you don't use
>>  __setitem__ and __delitem__ (maybe also __setattribute__,
>>  __delattribute__, ...) the state does not change.
> 
> It's __getattribute__, __getattr__, __setattr__ and __delattr__. I
> guess that's relevant if we're talking about a dict that's functioning
> as a namespace, but that's a bit meta. Setting attributes on the dict
> itself (I guess it's a subclass we're talking about; normal dict
> instances don't have a __dict__) wouldn't affect the hash table it uses
> for the contained items.
> 
> BTW, these slot wrappers generally aren't called in CPython, not
> unless you're overriding the built-in slot function. They're a hook
> back into the C API. They bind as a method-wrapper that has a function
> pointer to a C wrapper function that calls the C slot function.
> 
> If you override __setattr__, your type's tp_setattro slot is set to
> slot_tp_setattro, which gets called by PyObject_SetAttr. If the value
> being 'set' is NULL, this function looks up "__delattr__" in your
> type. Since you didn't override this, it finds and binds the
> __delattr__ slot wrapper from your base class(es).
> 
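The __setattr__/__delattr__ case above can be observed from Python with a small sketch. Recorder and set_calls are hypothetical names made up for this illustration, and the slot-level details are CPython-specific:

```python
set_calls = []


class Recorder(object):
    """Overrides __setattr__ only; __delattr__ is left to the base class."""

    def __setattr__(self, name, value):
        set_calls.append(name)
        object.__setattr__(self, name, value)


obj = Recorder()
obj.x = 1   # goes through the overridden __setattr__
del obj.x   # no override here: the inherited __delattr__ is looked up and used

# The inherited __delattr__ lives in object's namespace as a slot wrapper.
delattr_wrapper = vars(object)["__delattr__"]
print(type(delattr_wrapper).__name__)
print("__delattr__" in vars(Recorder))
```

The deletion succeeds even though Recorder defines no __delattr__ of its own, which matches the lookup described above.
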
> This also comes up with rich comparison functions, for which all 6
> comparisons are handled by the single slot function, tp_richcompare.
> So if you subclass dict and override __lt__, then slot_tp_richcompare
> finds the slot wrapper for the other 5 comparisons by searching dict's
> __dict__:
> 
>     >>> type(vars(dict)['__gt__'])
>     <type 'wrapper_descriptor'>
> 
> And binds it to the instance as a method-wrapper:
> 
>     >>> type(vars(dict)['__gt__'].__get__({}))
>     <type 'method-wrapper'>
> 
> After jumping through several hoops it ends up at dict_richcompare.
> For a regular dict, PyObject_RichCompare simply jumps straight to
> dict_richcompare. It doesn't use the slot wrapper.
> 
>>>      Keys and values are iterated over in an arbitrary order which is
>>>      non-random,
>> 
>>  That sounds like a contradictio in terminis to me. How can something
>>  be non-random and arbitrary at the same time?
> 
> It's just worded generally to be valid for all implementations. They
> could have gone into the specifics of the open-addressing hash table
> used by CPython's dict type, but it would have to be highlighted as an
> implementation detail. Anyway, the table has a history (collisions
> with other keys, dummy keys) and size that affects the insertion
> order. It isn't ontologically random; it's contingent. But that's
> getting too philosophical I think.
> 
>>>  CPython 3.3 defaults to enabling hash randomization. Set the
>>>  environment variable PYTHONHASHSEED=0 to disable it.
>> 
>>  So in addition to my "2to3" question from above, it might be a good
>>  idea to already set PYTHONHASHSEED so Python 2.x behaves like
>>  Python 3.x in this respect, right? Given that the environment
>>  variables are already loaded once Python has started, what would be
>>  the approach to test this? Call os.putenv("PYTHONHASHSEED", "1"), and
>>  then run the tests in a subprocess (that would know about the
>>  changed environment variables)?
> 
> Sorry, I don't see the point of this. PYTHONHASHSEED is to be set by a
> system administrator as a security measure. I think CPython is the
> only implementation that has this feature.

I was referring to what is at the bottom of this page (and I believe it's also mentioned in the doctest documentation):
http://python3porting.com/problems.html
Basically, setting PYTHONHASHSEED will help you find badly written doctests (ones that depend on a particular hash ordering), even prior to Python 3.3; from 3.3 onwards they will fail no matter what. I hope I am not too vague now ;-)
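
To make that concrete, here is a small sketch of a doctest that does not depend on hash ordering, assuming the fragile part is the printed order of a set or dict. The function unique_words is a made-up example, not something taken from that page:

```python
import doctest


def unique_words(words):
    """Return the distinct words as a set.

    Printing the set directly would tie the expected output to the hash
    order, so the example sorts the result first:

    >>> sorted(unique_words(["b", "a", "b"]))
    ['a', 'b']
    """
    return set(words)


if __name__ == "__main__":
    # Prints nothing when the docstring example passes.
    doctest.run_docstring_examples(unique_words, {"unique_words": unique_words})
```
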

> With regard to tests that depend on the PYTHON* environment variables,
> Python's own tests use subprocess.Popen to run sys.executable with a
> modified "env" environment (e.g. checking the effect of
> PYTHONIOENCODING).
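
A rough sketch of that approach, assuming all we want is to see the effect of PYTHONHASHSEED on str hashes in a child interpreter. The helper name hash_in_child is hypothetical and this is not code from Python's test suite:

```python
import os
import subprocess
import sys


def hash_in_child(seed, text="tutor"):
    """Run a fresh interpreter with PYTHONHASHSEED=seed and return hash(text)."""
    env = dict(os.environ)              # copy so the parent's environment is untouched
    env["PYTHONHASHSEED"] = str(seed)   # environment values must be strings
    code = "print(hash(%r))" % text
    out = subprocess.check_output([sys.executable, "-c", code], env=env)
    return int(out.decode().strip())


if __name__ == "__main__":
    # The same seed should give the same hash in separate processes.
    print(hash_in_child(1) == hash_in_child(1))
```

The variable is read at interpreter startup, which is why it is set through the child's "env" rather than in the already running process.
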

Ah, ok, glad to read that this is also sort of what I had in mind.