Bug in timsort!?

Dave Angel davea at davea.name
Tue Feb 24 21:41:42 EST 2015


On 02/24/2015 07:07 PM, Chris Angelico wrote:
> On Wed, Feb 25, 2015 at 10:50 AM, Skip Montanaro
> <skip.montanaro at gmail.com> wrote:
>> Even if/when we get to the point where machines can hold an array of
>> 2**49 elements, I suspect people won't be using straight Python to
>> wrangle them.
>
> Looking just at CPython, what is the absolute minimum storage space
> for a massive list like that? My guess is that a 64-bit CPython might
> get as good as eight bytes per element; playing around with smaller
> figures (up to circa a million elements) shows it ranging from 8 to 9
> bytes per element, bottoming out at just a smidge over 8, presumably
> at the moments when there's no slack space (there's a fixed overhead,
> which gets diminishingly significant). So an array of 2**49 elements
> would take 2**52 bytes (4 petabytes) of storage - addressable, to be
> sure, but an awful lot to be sorting.
>
> Would it be sufficient to stick a comment into the source saying "This
> may have problems with lists in excess of 2**49 elements", and leave
> it until that's actually likely to happen?
>
> ChrisA
>

One could do much better than that.  Put in a test for the length of the 
list, and if it's too large for the current implementation, throw an 
exception.  Much better than a comment.

-- 
DaveA



More information about the Python-list mailing list