Hpw make lists that are easy to sort.

kyosohma at gmail.com kyosohma at gmail.com
Wed Mar 28 14:14:52 EDT 2007


On Mar 28, 12:12 pm, Anton Vredegoor <anton.vredeg... at gmail.com>
wrote:
> Python's sorting algorithm takes advantage of preexisting order in a
> sequence:
>
> #sort_test.py
> import random
> import time
>
> def test():
>      n = 1000
>      k = 2**28
>
>      L = random.sample(xrange(-k,k),n)
>      R = random.sample(xrange(-k,k),n)
>
>      t = time.time()
>      LR = [(i+j) for i in L for j in R]
>      print time.time()-t
>      LR.sort()
>      print time.time()-t
>
>      print
>
>      t = time.time()
>      #L.sort()
>      R.sort()
>      presorted_LR = [(i+j) for i in L for j in R]
>      print time.time()-t
>      presorted_LR.sort()
>      print time.time()-t
>
> if __name__=='__main__':
>      test()
>
> On this -very slow- computer this prints:
>
>  >d:\python25\pythonw -u "sort_test.py"
> 1.10000014305
> 8.96000003815
>
> 1.10000014305
> 5.49000000954
>  >Exit code: 0
>
> Presorting the second sequence gains us more than three seconds. I
> wonder if there is a way to generate the combined items in such a way
> that sorting them is even faster? Is there some other sorting algorithm
> that can specifically take advantage of this way -or another way- of
> generating this list?
>
> The final sequence is len(L)*len(R) long but it is produced from only
> len(L)+len(R) different items, is it possible to exploit this fact? I'd
> also be interested in a more general solution that would work for
> summing the items of more than two lists in this way.
>
> A.

I found a website that hopefully will point you in the right
direction:

http://wiki.python.org/moin/HowTo/Sorting

And this one has an interesting profile of various sort methods with
Python:

http://www.biais.org/blog/index.php/2007/01/28/23-python-sorting-efficiency

Enjoy,

Mike




More information about the Python-list mailing list