Slices time complexity

Tue May 19 13:07:03 EDT 2015

On Wed, 20 May 2015 12:52 am, Bartc wrote:

> "Steven D'Aprano" <steve+comp.lang.python at pearwood.info> wrote in message
> news:555b0621$0$2753$c3e8da3$76491128 at news.astraweb.com...
[...]
>> I'm sure that lots of things surprise people who assume that every
>> language's performance characteristics are identical.
>>
>> Python is not Lisp. Lisp is not Java. Java is not Forth. Forth is not
>> Lua. Lua is not Fortran. Fortran is not PHP. Do you see a pattern yet?
>> :-)
> 
> But sometimes you want an algorithm that works perfectly well in one
> language to work just as well in another.

Yeah, and I want a million dollars, a pony, and the ability to eat as much
as I like without getting fat too. Life is cruel, and then you die.

[...]
> I'm surprised about the slice thing too; I was getting used to the idea of
> Python only passing around references, even to small integer values, and
> now here it goes and does a copy!

Yes. And? Are you surprised that dict.copy() makes a copy? Presumably not.
Well, the way we spell list.copy() is list[:]

> But apparently only a shallow (top-level) copy. Still, it sounds expensive
> if you just want to pass part of a list to a function, not the whole
> thing, a function which is not going to write into the list (like the OP's
> example).

Yes, a slice can be expensive, if you have (say) a ten billion element list,
and take a slice list[1:]. And, to be perfectly honest, there's no real
good solution for that in standard Python. Numpy arrays implement slicing
and cloning of arrays as views, but the Python standard library doesn't
have anything that does the same.

But, really... if you're serious about dealing with huge arrays of data, you
want something like numpy, not Python lists. So *in practice* the lack of
views for lists is a minor nuisance, if that. Having list slices return
copies rather than views avoids more problems than it causes.

On the other hand, if Python implementations would implement slices with
copy-on-write, that would give us the best of both worlds!

>>> Is there any reason why Python 3.4 implementation of slices cannot be
>>> a near constant operation?
>>
>> Yes. (1) Backwards compatibility. (2) That would go against the design of
>> Python slices that they are copies. (3) That would upset those who want
>> and
>> expect a slice to be a copy.
> 
> Or you could just have a different kind of slice that is just a 'view'.

Sure. And you can program it yourself, if you like.

True confession time: I've been using Python for over 15 years. For at least
12 of those years, I've found myself feeling guilty every time I write a
loop like "for x in seq[1:]" to skip the first element, because I'm worried
about copying a huge list. Every single time I think "I really ought to
write a list view." And then I think about the effort involved (minor)
versus the benefit (even smaller) and I think "screw it, I'll just make a
copy".

And in 12 years, this lazy "put off until tomorrow" attitude to list views
has failed me exactly *zero* times. I won't say You're Not Gonna Need It,
but I will say *I've* Never Needed It Yet.

I think that's why Python doesn't have a generalised sequence slice view.
It's not that the core developers are against the idea. It's just that,
weighing up the disadvantages, the advantages, and the effort involved, a
slice view has never been high enough on their priority list to get
included. The people who really need it already have numpy.

What will be off the table is changing the behaviour of list slicing. But
adding a sequence view? That's plausible. It just needs somebody to do the
work first.

>> I'll tell you what -- Python slices have worked this way for over 20
>> years.
>> You should propose to the Go designers that Go is confusing because it
>> doesn't match Python's behaviour, and they should change the meaning of
>> slices in Go to match Python. There's not as much Go code in the world as
>> Python code, so that will be far less disruptive. Tell them that Go
>> should be just like Python, otherwise it will confuse users who expect Go
>> slices to
>> behave like Python slices.
> 
> I agree with the OP, the way slices work in Python is confusing to those
> who expect them to work like array assignments. So if B is a
> million-element list, then A=B does not copy a million elements, neither
> shallow or deep.

Why would you expect slicing to be the same as assignment? Do you expect
list.append to be the same as assignment?

mylist = list(range(40))
mylist[3:26:3] = [1, 2, 3, 4, 5, 6, 7, 8]

How do you expect that to work, if not by copying?

> But A=B[1:] copies 999999 shallow elements. (Pointers I expect, and might
> need to change the reference counts of those 999999 elements. And when
> that slice is discarded after perhaps a handful of accesses, it might need
> to change them all back.

You say that as if it were expensive.

-- 
Steven