is there a better way?

Sat Feb 11 09:38:19 EST 2006

Alex Martelli wrote:
> markscala at gmail.com <markscala at gmail.com> wrote:
> 
> 
>>Problem:
>>
>>You have a list of unknown length, such as this: list =
>>[X,X,X,O,O,O,O].  You want to extract all and only the X's.  You know
>>the X's are all up front and you know that the item after the last X is
>>an O, or that the list ends with an X.  There are never O's between
>>X's.
> 
> 
> If the list is incredibly long, you should use a bisection approach.
> Standard module bisect in the Python library could help, but mostly as
> an _example_, since it unfortunately relies on normal alphabetic order,
> and alphabetically speaking X should come _after_ O, not _before_.
> 
> But the algorithm is still sound:
> 
> 1. look at the midpoint.
> 2. if it's an X, so are all previous items -- recurse to second half
> 3. if it's an O, so are all following items -- recurse to first half
> 
> Getting all conditions exactly right is tricky (which is why bisect is a
> good model!), but this way you get O(log N) performance for a list of
> length N.
> 
> If N is not too huge, O(N) might be OK, and is, of course, way simpler
> to code!-)
> 

The code would look something like this:

low = 0
high = len(L)
while low < high:
     mid = (low + high) // 2
     if L[mid] == 0:
         high = mid
     else:
         low = mid + 1
list_of_X = L[:low]

Cheers,

Carl Friedrich Bolz