Slices time complexity

Steven D'Aprano steve+comp.lang.python at pearwood.info
Thu May 21 03:35:55 EDT 2015


On Thursday 21 May 2015 05:51, Mario Figueiredo wrote:

> On Wed, 20 May 2015 03:07:03 +1000, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
[...]
>>But, really... if you're serious about dealing with huge arrays of data,
>>you want something like numpy, not Python lists.
> 
> You don't need huge. In any algorithm where slices are used, you
> have to account for their linear complexity. You seem to be
> dismissive of this fact, but it can have tremendous performance
> implications, since it can turn a linear algorithm into a quadratic
> one just like that.

Sure. And for small enough N, everything is fast.
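To be concrete about the trap being described (a sketch, with made-up 
function names): a recursive sum built on [1:] looks linear, but every 
slice copies the remaining elements, so the total work is 
N + (N-1) + ... + 1, i.e. O(N**2).

```python
def sum_slices(lst):
    # Looks linear, but lst[1:] copies the tail on every call,
    # so the total work is quadratic in len(lst).
    if not lst:
        return 0
    return lst[0] + sum_slices(lst[1:])

def sum_indexed(lst, i=0):
    # Same recursion, but walking by index: no copies, linear work.
    if i == len(lst):
        return 0
    return lst[i] + sum_indexed(lst, i + 1)

assert sum_slices(list(range(10))) == sum_indexed(list(range(10))) == 45
```

For small lists both are plenty fast, which is part of why the 
quadratic version so often goes unnoticed.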

There's a sorting algorithm, usually called "bogosort" (sometimes "Bozo 
sort"), which shuffles the list, then checks whether it is sorted. If not, 
it shuffles it again, until it is sorted. It is a *terrible* algorithm, 
with no upper limit to the worst case, and O(N!) shuffles for the average 
case. And yet, with small lists (say, five items) the performance is 
acceptable if your requirements are low.
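The algorithm just described fits in a few lines (a sketch; only ever 
run this on tiny lists, since there is no bound on how long it can take):

```python
import random

def bozo_sort(lst):
    """Shuffle the list in place until it happens to be sorted."""
    while any(lst[i] > lst[i + 1] for i in range(len(lst) - 1)):
        random.shuffle(lst)
    return lst

bozo_sort([3, 1, 2])  # terminates quickly for three items
```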

Should you use Bozo sort in production? No, of course not. But the point I 
am making is that Big Oh analysis doesn't tell you how fast a particular 
implementation will run, only how it will scale for "sufficiently large" N. 
An O(N**2) algorithm with a small multiplier may be much faster than an O(N) 
algorithm with a large multiplier for all the data sets you care about.
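A toy illustration of that last point, with hypothetical cost functions 
(the multipliers are invented for the example):

```python
# An O(N) algorithm with a large multiplier versus an O(N**2)
# algorithm with a small one.
def linear_cost(n):
    return 1000 * n   # e.g. heavy per-item overhead

def quadratic_cost(n):
    return n * n      # e.g. cheap inner loop

# The quadratic algorithm is cheaper until the crossover at N = 1000.
assert quadratic_cost(100) < linear_cost(100)
assert quadratic_cost(100_000) > linear_cost(100_000)
```

If all your data sets sit below the crossover point, the "worse" Big Oh 
class is the faster choice in practice.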


> Certainly there are alternatives and iterating through the list is a
> much better method. There's no arguing against that. But if you are
> paying attention to how Python is being taught, [1:] turns up all
> over the web, presented as a cool trick showcasing Python's
> strengths. Well, it turns out to be a bad practice, performance-wise,
> in many cases.

That has not been my experience, or many other people's.

If your experience is different from mine, then as I said, there is numpy, 
or you can write your own views. Just make sure that in trying to optimize 
your code, you don't end up actually making it *slower* by optimizing for 
more data than you will ever, or could ever, have to deal with. Big Oh 
theoretical analysis only goes so far; you need actual benchmarks and 
profiling to make good decisions about which algorithm you should use.
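As a minimal benchmarking sketch, the standard library's timeit is enough 
to compare a copying slice against a lazy itertools.islice "view" at the 
sizes you actually expect (absolute timings will vary by machine):

```python
from itertools import islice
from timeit import timeit

data = list(range(1000))

# A copying slice versus a lazy iterator that skips the first element.
t_slice = timeit(lambda: sum(data[1:]), number=1000)
t_view = timeit(lambda: sum(islice(data, 1, None)), number=1000)

# Both compute the same result; which is faster depends on N and on
# what you do with the elements -- measure, don't guess.
assert sum(data[1:]) == sum(islice(data, 1, None))
print(f"slice: {t_slice:.4f}s  islice: {t_view:.4f}s")
```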



-- 
Steve
