best way to determine sequence ordering?

Edward Elliott nobody at 127.0.0.1
Sat Apr 29 20:07:12 EDT 2006


Steven Bethard wrote:
> Ok, lets get comparable functions by writing them both in Python:

First of all, your functions aren't quite comparable.  The first index takes
the value to locate as a variable, while the second has both values
hard-coded as literals.  Changing the second one to index2(L, a, b) makes a
slight but not significant difference in the timings.

Secondly, the lists you tested with are artificially short.  As you increase
the size of the list, the difference lessens to unimportance:

$ python -m timeit -s "import temp; L = ['x'] * 500 + ['C', 'A', 'D', 'B']"
"a, d = temp.index2(L,'A','D'); a < d"
1000 loops, best of 3: 227 usec per loop

$ python -m timeit -s "import temp; L = ['x'] * 500 + ['C', 'A', 'D', 'B']"
"a = temp.index1(L, 'A'); d = temp.index1(L, 'D'); a < d"
1000 loops, best of 3: 236 usec per loop

Third, the target values are very close together in those lists.  If there's
a large difference in their positions, then the "two-in-one-pass" algorithm
becomes much slower:

$ python -m timeit -s "import temp; L = ['C','A'] + ['x'] * 500 + ['D',
'B']" "a = temp.index1(L, 'A'); d = temp.index1(L, 'D'); a < d"
10000 loops, best of 3: 120 usec per loop

$ python -m timeit -s "import temp; L = ['C','A'] + ['x'] * 500 + ['D',
'B']" "a, d = temp.index2(L,'A','D'); a < d"
1000 loops, best of 3: 267 usec per loop

Remember kids:
1. Numbers can show anything
2. Know your data set
3. Premature optimizations are evil





More information about the Python-list mailing list