[ python-Bugs-1721372 ] emphasize iteration volatility for set

SourceForge.net noreply at sourceforge.net
Tue May 22 07:22:53 CEST 2007


Bugs item #1721372, was opened at 2007-05-18 17:10
Message generated for change (Comment added) made by loewis
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1721372&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Documentation
Group: None
Status: Closed
Resolution: Rejected
Priority: 5
Private: No
Submitted By: Alan (aisaac0)
Assigned to: Nobody/Anonymous (nobody)
Summary: emphasize iteration volatility for set

Initial Comment:
For <URL:http://docs.python.org/lib/types-set.html>, append the following new sentence to the 2nd paragraph.

    Iteration over a set returns elements in an indeterminate order, which generally depends on factors outside the scope of the containing program.

*Justification:* users should not be expected to understand without being told that iteration order depends on factors outside the scope of the containing program. (Additionally, unlike the documentation for dictionaries, the documentation for sets fails to give a serious warning not to rely on iteration order.)


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2007-05-22 07:22

Message:
Logged In: YES 
user_id=21627
Originator: NO

aisaac0: The first sentence of types-set says "A set object is an
unordered collection of immutable values."

How else could you interpret "unordered collection" but as "having
arbitrary order"?

----------------------------------------------------------------------

Comment By: Alan (aisaac0)
Date: 2007-05-22 05:27

Message:
Logged In: YES 
user_id=1025672
Originator: YES

Note that on c.l.python, Raymond Hettinger justifies this rejection as
follows:

  "the docs are sufficient when they say that set ordering is arbitrary"

Where exactly do the docs say this? I do not see it.
I am looking here: <URL:http://docs.python.org/lib/types-set.html>

I also take this as a concession that the docs *should* say something like
this,
which is about half of the language I proposed
(unless there is some reason why 'arbitrary' is superior to
'indeterminate').

Btw, I did provide the source code to several people before Peter.
This was clear in the thread on c.l.python.
I do not think they would appreciate being called "not experienced".


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2007-05-19 23:29

Message:
Logged In: YES 
user_id=21627
Originator: NO

aisaac0, thanks for elaborating. Your remark now convinces me that it was
the right thing to reject this change.

It seems that you suggest that experienced users
a) are aware that some objects compare and hash by their id(), and
b) that the id() is the address in memory, and
c) that the id() will influence the order in which objects are iterated,
and
d) fail to see that the id() may differ across runs

Such users are *not* experienced. There are many more reasons why the id
of an object may vary across runs. E.g. Linux 2.6 deliberately randomizes
memory management, so that identical processes get their objects allocated
at different addresses, to defeat security exploits that rely on
deterministic address of things in main memory (there is a system call to
disable this randomization).
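
A minimal sketch of points (a) through (c), assuming CPython and a
hypothetical class named Token that defines neither __eq__ nor __hash__.
Such objects compare by identity, so two instances with the same contents
are still distinct set members. The printed id() values are whatever the
interpreter assigns and may differ from one run to the next.

```python
class Token:
    """Hypothetical class: no __eq__ or __hash__, so identity is used."""

    def __init__(self, label):
        self.label = label


a = Token("x")
b = Token("x")

same_contents = a.label == b.label   # the payloads are equal
equal_by_default = a == b            # default equality compares identity
distinct_in_set = len({a, b})        # both objects are kept in the set

print(same_contents, equal_by_default, distinct_in_set)
print(id(a), id(b))                  # values are not stable across runs
```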

Looking at the entire thread, I agree with Carsten Haese's posting: That
even experienced users couldn't diagnose this correctly is because they
a) did not receive the source code, and
b) were talked into believing that this has something to do with the
random module.

The library reference is a specification, not a tutorial.


----------------------------------------------------------------------

Comment By: Alan (aisaac0)
Date: 2007-05-19 15:09

Message:
Logged In: YES 
user_id=1025672
Originator: YES

The previous comment completely misses the point.  Again, please see the
discussion on c.l.python.  Not one of the participants expected sets to be
"ordered". What was surprising to them was that the order can *change* across
sequential executions of an **unchanged** source.  This is of course
*quite* different than expecting that sets are ordered; I am perplexed that
anyone would conflate the two.  One cannot credibly argue that anyone who
understands that sets are not ordered will not be surprised, since even
sophisticated users were as a matter of fact surprised in the c.l.python
discussion.  (Until it was explained by Peter of course.)  A natural
conclusion is that the docs should offer better protection against such
surprise, since we have concrete evidence that even sophisticated users can
be surprised by this.

In sum, the previous comment conflates two distinct issues and so fails to
address the reasons for the proposed docs patch.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2007-05-19 08:38

Message:
Logged In: YES 
user_id=21627
Originator: NO

The documentation already says "Being an unordered collection, sets do not
record element position or order of insertion."

If users read this and fail to understand the notion of an unordered
collection, I see no way of "fixing" this.

----------------------------------------------------------------------

Comment By: Alan (aisaac0)
Date: 2007-05-19 04:28

Message:
Logged In: YES 
user_id=1025672
Originator: YES

While I do not mind my language being rejected, *something* should be
added to warn users.  What the previous comment fails to mention is the
number of people on c.l.python, some of whom are quite sophisticated users,
who failed to discover the source of indeterminacy.  Users should not have
to "rediscover" this because of a documentation failure.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2007-05-19 01:08

Message:
Logged In: YES 
user_id=80475
Originator: NO

While the OP knows what he means here, the suggested text does not add
clarity, it only makes the subject harder to understand and implies that
some mysterious, dark force is in place.  Further, the suggested text is
simply incorrect.  Given deterministic assignment of hash values and a
consistent insertion order, the order of keys in a set or dictionary is
fully determined.

I've read the source of this suggestion on comp.lang.python and commented
there.  The underlying issue had nothing to do with either sets or dicts.
The code in question "re-discovered" that the location of objects in memory
would vary between runs if the user deleted a pyc file for a module.  The
OP's script used object ids as hash values, hence the set/dict ordering
could vary between runs.  This was at odds with his expectation that
the ordering would be deterministic.  The moral is that non-deterministic
hash values lead to non-deterministic set/dict ordering.
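
The situation described above can be sketched as follows. This is an
illustration only, not the OP's script: the class Node and its name
attribute are hypothetical. Node keeps the default identity-based hash, so
the iteration order of the set may change between runs. Sorting on a stable
key is one way to get a repeatable order.

```python
class Node:
    """Hypothetical class that keeps the default identity-based hash."""

    def __init__(self, name):
        self.name = name


nodes = [Node(n) for n in ("a", "b", "c", "d")]
node_set = set(nodes)

# Order here follows the hash values, which derive from object identity,
# so it is not guaranteed to be the same on the next run.
iteration_order = [n.name for n in node_set]

# Sorting on a value the program controls gives a repeatable order.
stable_order = [n.name for n in sorted(node_set, key=lambda n: n.name)]

print(iteration_order)
print(stable_order)
```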

The docs for sets and dicts should not be muddled with tangential
discussions about implementation specific details regarding what governs
where objects are placed in memory.


----------------------------------------------------------------------

Comment By: Alan (aisaac0)
Date: 2007-05-18 20:00

Message:
Logged In: YES 
user_id=1025672
Originator: YES

Location in memory.
See Peter Otten's discussion at
http://www.thescripts.com/forum/post2552380-16.html

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2007-05-18 19:05

Message:
Logged In: YES 
user_id=21627
Originator: NO

What factors outside the containing program influence iteration order?
Iteration is completely deterministic, and only depends on the items
inserted, and the order in which they were inserted, neither of which is
outside the scope of the containing program. It's just that the order is
not easily predictable.
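
A small sketch of that claim, assuming CPython and small integers, whose
hash values are fixed. The helper name build is made up for the example.
Building a set twice from the same items in the same insertion order gives
the same iteration order, even though that order need not match the
insertion order.

```python
def build(items):
    """Insert items one at a time so the insertion order is explicit."""
    result = set()
    for item in items:
        result.add(item)
    return result


items = [30, 10, 20, 40]
first = list(build(items))
second = list(build(items))

print(first)
print(second)
print(first == second)
```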

----------------------------------------------------------------------


