Multi-dimensional list initialization

Andrew Robinson andrew3 at r3dsolutions.com
Tue Nov 6 17:41:24 EST 2012


On 11/06/2012 01:04 AM, Steven D'Aprano wrote:
> On Mon, 05 Nov 2012 21:51:24 -0800, Andrew Robinson wrote:
>
>> The most compact notation in programming really ought to reflect the
>> most *commonly* desired operation.  Otherwise, we're really just making
>> people do extra typing for no reason.
> There are many reasons not to put minimizing of typing ahead of all other
> values:
I didn't.  I put it ahead of *some* values for the sake of practicality 
and human psychology.
" Practicality beats purity. "

>
> * Typically, code is written once and read many times. Minimizing
>    typing might save you a second or two once, and then cost you many
>    seconds every time you read the code. That's why we tell people to
>    choose meaningful variable names, instead of naming everything "a"
>    and "b".
Yes.  But this isn't going to cost any more time than figuring out 
whether or not the list multiplication is going to cause quirks, 
itself.  Human psychology *tends* (it's a FAQ!) to automatically assume 
the purpose of the list multiplication is to pre-allocate memory for the 
equivalent (using lists) of a multi-dimensional array.  Note the OP even 
said "4d array".

The OP's original construction was simple, elegant, easy to read and 
very commonly done by newbies learning the language because it's 
*intuitive*.  His second try was still intuitive, but less easy to read, 
and not as elegant.

>
> * Consistency of semantics is better than a plethora of special
>    cases. Python has a very simple and useful rule: objects should
>    not be copied unless explicitly requested to be copied. This is
>    much better than having to remember whether this operation or
>    that operation makes a copy. The answer is consistent:
Bull.  Even in the last thread I noted the range() object produces 
special cases.
 >>> range(0,5)[1]
1
 >>> range(0,5)[1:3]
range(1, 3)
 >>>

The principle involved is that it gives you what you *usually* want;  I 
read some of the documentation on why Python 3 chose to implement it 
this way.

>
>    (pardon me for belabouring the point here)
>
>      Q: Does [0]*10 make ten copies of the integer object?
>      A: No, list multiplication doesn't make copies of elements.
Neither would my idea for the vast majority of things on your first list.

>      Q: What about [[]]*10?
>      A: No, the elements are never copied.

YES! For the obvious reason that such a construction is making mutable 
lists that the user wants to populate later.  If they *didn't* want to 
populate them later, they ought to have used tuples -- which take less 
overhead.  Who even does this thing you are suggesting?!

 >>> a=[[]]*10
 >>> a
[[], [], [], [], [], [], [], [], [], []]
 >>> a[0].append(1)
 >>> a
[[1], [1], [1], [1], [1], [1], [1], [1], [1], [1]]

Oops! Damn, not what anyone normal wants....
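
A minimal sketch of the aliasing shown above, next to the list-comprehension spelling that yields independent inner lists. The variable names are illustrative only:

```python
# [[]] * 3 repeats a reference to ONE inner list three times.
shared = [[]] * 3
shared[0].append(1)
assert shared == [[1], [1], [1]]
assert shared[0] is shared[1] is shared[2]

# A list comprehension evaluates [] once per iteration,
# so each slot holds its own distinct list.
independent = [[] for _ in range(3)]
independent[0].append(1)
assert independent == [[1], [], []]
assert independent[0] is not independent[1]
```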

>      Q: How about if the elements are subclasses of list?
>      A: No, the elements are never copied.
Another poster brought that point up -- it's something I would have to 
study before answering.
It's a valid objection.

>
>      Q: What about other mutable objects like sets or dicts?
>      A: No, the elements are never copied.
They aren't list multiplication compatible in any event! It's a total 
nonsense objection.

If these are inconsistent in my idea -- OBVIOUSLY -- they are 
inconsistent in Python's present implementation.  You can't even 
duplicate references to them NOW.

 >>> { 1:'a', 2:'b', 3:'c' } * 2
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for *: 'dict' and 'int'

>      Q: How about on Tuesdays? I bet they're copied on Tuesdays.
>      A: No, the elements are never copied.
That's really a stupid objection, and everyone knows it.
" Although that way may not be obvious at first unless you're Dutch. "

>
> Your proposal throws away consistency for a trivial benefit on a rare use-
> case, and replaces it with a bunch of special cases:
RARE!!!! You are NUTS!!!!

>      Q: What about [[]]*10?
>      A: Oh yeah, I forgot about lists, they're copied.
Yup.

>      Q: How about if the elements are subclasses of list?
>      A: Hmmm, that's a good one, I'm not actually sure.
>
>      Q: How about if I use delegation to proxy a list?
>      A: Oh no, they definitely won't be copied.
Give an example usage of why someone would want to do this.  Then we can 
discuss it.
>      Q: What about other mutable objects like sets or dicts?
>      A: No, definitely not. Unless people complain enough.
Now you're just repeating yourself to make your contrived list longer -- 
but there are no new objections...

> Losing consistency in favour of saving a few characters for something as
> uncommon as list multiplication is a poor tradeoff. That's why this
> proposal has been rejected again and again and again every time it has
> been suggested.
Please link to where this was proposed to the developers, and their 
reasoning for rejecting it.
I think you are exaggerating.

> List multiplication [x]*n is conceptually equivalent to:
> <snip>
> This is nice and simple and efficient.
No it isn't efficient. It's *slow* when done as in your example.

> Copying other objects is slow and inefficient. Keeping list
> multiplication consistent, and fast, is MUCH more important than making
> it work as expected for the rare case of 2D arrays:
I don't think so -- again, look at range(); it was made to work 
inconsistently for a "common" case.

Besides, 2D arrays are *not* rare and people *have* to copy internals of 
them very often.
The copy speed will be the same or *faster*, and the typing less -- and 
the psychological mistakes *fewer*, the elegance more.

It's hardly going to confuse anyone to say that lists are copied with 
list multiplication, but the elements are not.
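
To make the idea concrete, here is a rough pure-Python sketch of the semantics being argued for. It is not how CPython implements list multiplication, and `repeat_copying_lists` is a hypothetical name. The exact-type check (`type(item) is list`) is an assumption; how subclasses of list should behave is left open in this thread.

```python
def repeat_copying_lists(items, n):
    """Hypothetical stand-in for `items * n` under the proposed semantics."""
    result = []
    for _ in range(n):
        for item in items:
            if type(item) is list:
                # Assumption: only plain lists get a shallow copy;
                # the copied list's own elements are still shared.
                result.append(item[:])
            else:
                # Everything else is repeated by reference, as today.
                result.append(item)
    return result


grid = repeat_copying_lists([[0, 0]], 3)
grid[0][0] = 9          # only the first row changes

marker = object()
same = repeat_copying_lists([marker], 2)   # non-list elements stay shared
```

With this sketch, `grid` ends up with three independent rows, while `same[0]` and `same[1]` are still the one `marker` object.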

Every time someone passes a list to a function, they *know* that the 
list is passed by value -- and the elements are passed by reference.  
People using Python are USED to lists being "the" source of weird 
behavior that other languages don't have.

>
> Copying those elements does not come for free.
>
> It is true that list multiplication can be much faster than a list comp.
> But that's because the list multiply doesn't have to inspect the
> elements, copy them, or engage the iteration machinery. Avoiding copying
> gives you a big saving:
>
>
> [steve at ando ~]$ python3.3 -m timeit -s "x = range(1000)"
> "[x for _ in range(100)]"  # not copied
> 100000 loops, best of 3: 11.9 usec per loop
>
> [steve at ando utilities]$ python3.3 -m timeit -s "x = range(1000)"
> "[x[:] for _ in range(100)]"  # copied
> 10000 loops, best of 3: 103 usec per loop
>
> So there's a factor of ten difference right there. If list multiplication
> had to make copies, it would lose much of its speed advantage.
And when multiplication doesn't make copies of *lists*, it's going 
"nowhere fast", because people don't want the results that gives.

So what difference does it make?  People won't make the construction 
unless they wanted to make the copies in the first place.  If they want 
the copies, well -- copies are *slow*.  Big deal.

>   For large
> enough lists, or complicated enough objects, it would become slower than
> a list comprehension.
Huh? You're nuts.

> It would be even slower if list multiplication had to inspect each
> element first and decide whether or not to copy.
A single pointer comparison in a 'C' for loop takes less than 5 
nanoseconds on a 1 GHz machine.
(I'll bet yours is faster than that...!)
Consider: list objects have a pointer which points back to the generic 
list object -- that's all it takes to determine what the "type" is.

Your measured loop times for the list comprehensions are over 10 
microseconds *per loop*.
Compared to what you're proposing -- the pointer compare is a mere 0.05% 
change; you can't even measure that with "timeit"!  BUT: The increase 
in speed for not running tokenized "for" loops is *much* bigger than the 
loss for a single pointer compare; so it will *usually* be a *serious* 
net gain.
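
For anyone who wants to measure this on their own machine, a small sketch using the standard `timeit` module. The payload `x` (a 1000-element list) and the helper name `seconds_per_run` are assumptions for illustration; timings vary by machine and Python version, so no figures are claimed here.

```python
import timeit

x = list(range(1000))  # assumed payload; the quoted benchmark used range(1000)


def seconds_per_run(stmt, number=200):
    # Best of 3 repeats, divided down to seconds per single execution.
    best = min(timeit.repeat(stmt, globals={"x": x}, repeat=3, number=number))
    return best / number


timings = {
    "[x] * 100": seconds_per_run("[x] * 100"),                          # shared, multiply
    "[x for _ in range(100)]": seconds_per_run("[x for _ in range(100)]"),        # shared, comprehension
    "[x[:] for _ in range(100)]": seconds_per_run("[x[:] for _ in range(100)]"),  # copied, comprehension
}

for stmt, seconds in timings.items():
    print(f"{stmt:30s} {seconds * 1e6:10.2f} usec per loop")
```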

>> I really don't think doing a shallow copy of lists would break anyone's
>> program.
> Anyone who is currently using list multiplication with mutable objects is
> expecting that they will be the same object, and relying on that fact.
> Otherwise they wouldn't be using list multiplication.
Yes, and I'm not changing that -- except for lists; and *no* one is 
using that.
Find two examples of it from existing non-contrived web examples of 
Python code.
*Ask* around.

>
> You're suggesting a semantic change. Therefore they will be expecting
> something different from what actually happens. Result: broken code.
Even if it were, so are many of the semantic changes happening between 
Python 2 and Python 3.
Look at what Python 2 did:

 >>> range(0,5)[0]
0
 >>> range(0,5)[1:3]
[1, 2]

That's a *semantic* change.
Also, if you complain that xrange has been renamed range, then look:

 >>> xrange(0,5)[0]
0
 >>> xrange(0,5)[1:3]
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
TypeError: sequence index must be integer, not 'slice'

WOW. WOW. WOW.  An even BIGGER semantic change.


> It's not just mutable objects. It's also objects that can't be copied.
> Result: mylist*3 used to work, now it raises an exception. And
> performance issues: what used to be fast is now slow.
Where do you get off?? A list can be copied -- the contents might not be.

> Even if this change was allowed, it would have to go through a multi-year
> process.
Fine.  If that's normal -- then let them process it the normal way.  
That's not my concern in the slightest.

> to get the behaviour you want, and then in Python 3.5 it would become the
> default. That's three years until it becomes the standard. Meanwhile,
> there will still be millions of people using Python 2.7 or 3.2, and their
> code will behave differently from your code.
Uh, they aren't *using* the construction I am proposing now -- they are 
avoiding it like the plague.
Hence, it will merely become a new ability in a few years -- not 
'differently' behaving code.

The rest of your repetitive nonsense has been deleted.
:(



