best way to determine sequence ordering?

nikie n.estner at gmx.de
Sat Apr 29 19:52:50 EDT 2006


Steven Bethard wrote:

> nikie wrote:
> > Steven Bethard wrote:
> >
> >> nikie wrote:
> >>> Steven Bethard wrote:
> >>>
> >>>> John Salerno wrote:
> >>>>> If I want to make a list of four items, e.g. L = ['C', 'A', 'D', 'B'],
> >>>>> and then figure out if a certain element precedes another element, what
> >>>>> would be the best way to do that?
> >>>>>
> >>>>> Looking at the built-in list functions, I thought I could do something
> >>>>> like:
> >>>>>
> >>>>> if L.index('A') < L.index('D'):
> >>>>>     # do some stuff
> >>>> This is probably a pretty reasonable approach as long as you're not too
> >>>> worried about efficiency.  It scans the list twice though, so if you
> >>>> start doing this with really big lists, it might be better to do
> >>>> something like:
> >>> On average, it'll only have go through half the list twice, as it can
> >>> break as soon as it finds the item it searches for. Don't think you can
> >>> get any more efficient than that.
> >> Sure you can:
> >>
> >>      a_index = None
> >>      d_index = None
> >>      for i, item in enumerate(L):
> >>          if item == 'A':
> >>              a_index = i
> >>          elif item == 'D':
> >>              d_index = i
> >>          if a_index is not None and d_index is not None:
> >>              break
> >>
> >>      if a_index < d_index:
> >>          # do some stuff
> >>
> >> That goes through exactly the minimum number of elements.  But it's
> >> probably slower than two .index() calls since .index() is coded in C.
> >> But timeit and see if you're interested.
> >
> > On my PC, calling index twice is 5 times faster.
> >
> > But let's just pretend for a moment Python code was compiled and
> > optimized as well as C code.
>
> Ok, lets get comparable functions by writing them both in Python:

That's changing the subject, and you know that. The OP asked whether
using "L.index('A') < L.index('D')" was a good (pythonic, efficient)
idea. It is. Maybe it wouldn't be efficient if "List.index" was
implemented in python (because, as I have already said in my previous
post, looping through an enumerate-object in python is more complex
than a simple C-loop), but it is actually implemented in C. Even if you
wrote a C function that tried to do all the comparisons in one sweep
through the list, I'm willing to bet it won't be faster than the OP's
suggestion on the average (at least for moderate-sized lists,
otherwise, processing the list in blocks, using the cache more
efficiently might be a good idea).

That's what this thread was all about. Now, I don't really see what you
are trying to say: Are you still trying to convince the OP that he
should write a Python function like one of those you suggested, for
performance reasons? If not, I must have misunderstood your last posts,
and apologize for repeating the obvious ;-)




More information about the Python-list mailing list