is there a better way?

Sat Feb 11 01:58:30 EST 2006

markscala at gmail.com <markscala at gmail.com> wrote:

> Problem:
> 
> You have a list of unknown length, such as this: list =
> [X,X,X,O,O,O,O].  You want to extract all and only the X's.  You know
> the X's are all up front and you know that the item after the last X is
> an O, or that the list ends with an X.  There are never O's between
> X's.

If the list is incredibly long, you should use a bisection approach.
Standard module bisect in the Python library could help, but mostly as
an _example_, since it unfortunately relies on normal alphabetic order,
and alphabetically speaking X should come _after_ O, not _before_.

But the algorithm is still sound:

1. look at the midpoint.
2. if it's an X, so are all previous items -- recurse to second half
3. if it's an O, so are all following items -- recurse to first half

Getting all conditions exactly right is tricky (which is why bisect is a
good model!), but this way you get O(log N) performance for a list of
length N.

If N is not too huge, O(N) might be OK, and is, of course, way simpler
to code!-)

Alex