Slices time complexity
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Thu May 21 03:35:55 EDT 2015
On Thursday 21 May 2015 05:51, Mario Figueiredo wrote:
> On Wed, 20 May 2015 03:07:03 +1000, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
[...]
>>But, really... if you're serious about dealing with huge arrays of data,
>>you want something like numpy, not Python lists.
>
> You don't need huge. In any algorithm where slices are used, you
> have to account for their linear complexity. You seem to be
> dismissive of this fact, but it can have tremendous performance
> implications, since it can turn a linear algorithm into a quadratic
> one just like that.
Sure. And for small enough N, everything is fast.
There's a sorting algorithm, usually called "bogosort", which shuffles the
list, then checks whether it is sorted. If not, it shuffles again, and so on
until the list is sorted. Bogosort is a *terrible* algorithm, with no upper
bound on the worst case and roughly O(N*N!) behaviour on average. And yet,
with small lists (say, five items) the performance is acceptable if your
requirements are low.
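A minimal sketch of that shuffle-until-sorted approach (the function name is
mine, chosen for illustration):

```python
import random

def bogosort(items):
    """Shuffle until sorted: unbounded worst case, ~O(N * N!) on average."""
    items = list(items)  # work on a copy; tiny inputs only!
    while any(a > b for a, b in zip(items, items[1:])):
        random.shuffle(items)
    return items
```

Even at five items this can take hundreds of shuffles before it stumbles on
the sorted order, but each individual shuffle is cheap, so you barely notice.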
Should you use bogosort in production? No, of course not. But the point I
am making is that Big Oh analysis doesn't tell you how fast a particular
implementation will run, only how it will scale for "sufficiently large" N.
An O(N**2) algorithm with a small constant factor may be much faster than an
O(N) algorithm with a large constant factor for all the data sets you care
about.
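To make that concrete, here is a toy cost model (the multipliers are invented
purely for illustration): the quadratic routine wins until N passes the
crossover point.

```python
def cost_linear(n):
    # O(N) algorithm, but with a large constant factor
    # (imagine heavy per-item work).
    return 1000 * n

def cost_quadratic(n):
    # O(N**2) algorithm with a small constant factor.
    return n * n

# The quadratic routine is cheaper until the crossover at N == 1000.
for n in (10, 100, 1000, 10_000):
    winner = "quadratic" if cost_quadratic(n) < cost_linear(n) else "linear"
    print(f"N={n:>6}: {winner} wins")
```

For any workload that never grows past the crossover, the "worse" algorithm
is the faster one in practice.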
> Certainly there are alternatives, and iterating through the list is a
> much better method. There's no arguing against that. But if you pay
> attention to how Python is being taught, [1:] is everywhere on the
> web, announced as a cool trick showcasing Python's strengths. Well, it
> turns out it is actually a bad practice, performance-wise, in many
> cases.
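The quadratic risk being described shows up when [1:] lands inside recursion:
each slice copies the whole tail. A small sketch (the function names are mine):

```python
def total_sliced(seq):
    # Each recursive call builds a brand-new list with seq[1:]:
    # O(N) copying per call, O(N**2) work overall.
    if not seq:
        return 0
    return seq[0] + total_sliced(seq[1:])

def total_indexed(seq):
    # Plain iteration touches each element once: O(N) overall.
    total = 0
    for item in seq:
        total += item
    return total
```

Both return the same answer; only the sliced version pays the hidden copying
cost on every step.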
That has not been my experience, nor that of many other people.
If your experience is different from mine, then as I said, there is numpy,
or you can write your own views. Just make sure that in trying to optimize
your code, you don't end up actually making it *slower* by optimizing for
more data than you will ever, or could ever, have to deal with. Big Oh
analysis only goes so far; you need actual benchmarks and profiling to make
good decisions about which algorithm to use.
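For bytes-like data the standard library already has such a view: slicing a
memoryview is O(1) and shares the underlying buffer instead of copying it
(for general lists you would need numpy or a hand-rolled view class):

```python
data = bytearray(range(10))
view = memoryview(data)

tail = view[1:]       # O(1): a new view over the same buffer, no copy
print(tail[0])        # 1 -- the same byte as data[1]

data[1] = 99          # mutate the underlying buffer...
print(tail[0])        # 99 -- the view sees the change
```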
--
Steve
More information about the Python-list mailing list