Bug in timsort!?

Chris Angelico rosuav at gmail.com
Tue Feb 24 19:07:52 EST 2015


On Wed, Feb 25, 2015 at 10:50 AM, Skip Montanaro
<skip.montanaro at gmail.com> wrote:
> Even if/when we get to the point where machines can hold an array of
> 2**49 elements, I suspect people won't be using straight Python to
> wrangle them.

Looking just at CPython, what is the absolute minimum storage space
for a massive list like that? My guess is that a 64-bit CPython might
get as good as eight bytes per element; playing around with smaller
figures (up to circa a million elements) shows it ranging from 8 to 9
bytes per element, bottoming out at just a smidge over 8, presumably
at the moments when there's no slack space (there's a fixed overhead,
which gets diminishingly significant). So an array of 2**49 elements
would take 2**52 bytes (4 petabytes) of storage - addressable, to be
sure, but an awful lot to be sorting.

Would it be sufficient to stick a comment into the source saying "This
may have problems with lists in excess of 2**49 elements", and leave
it until that's actually likely to happen?

ChrisA



More information about the Python-list mailing list