is there a better way?
Alex Martelli
aleaxit at yahoo.com
Sat Feb 11 11:39:30 EST 2006
Charles Krug <cdkrug at aol.com> wrote:
> On 2006-02-11, Alex Martelli <aleaxit at yahoo.com> wrote:
> > markscala at gmail.com <markscala at gmail.com> wrote:
> >
> >> Problem:
> >>
> >> You have a list of unknown length, such as this: list =
> >> [X,X,X,O,O,O,O]. You want to extract all and only the X's. You know
> >> the X's are all up front and you know that the item after the last X is
> >> an O, or that the list ends with an X. There are never O's between
> >> X's.
> >
> > If the list is incredibly long, you should use a bisection approach.
> > Standard module bisect in the Python library could help, but mostly as
> > an _example_, since it unfortunately relies on normal alphabetic order,
> > and alphabetically speaking X should come _after_ O, not _before_.
>
> Isn't every call to list.index() an O(n) operation? We certainly want
> to avoid multiple calls there if we can.
Why should we have ANY call to index whatsoever?
> What happens if your split occurs in the middle of your block of Xs?
> Then the split before/after fails --the stated algorithm says, "If the
> split is an X, choose the front half," so perhaps the statement was
> inprecise?
What statement? What are you TALKING about?! What I said, and you
don't quote, was:
> > 2. if it's an X, so are all previous items -- recurse to second half
how can you POSSIBLY read this as "choose the front half"?! I say to
recurse (iteration works, too, but it's even trickier to code) to the
SECOND half, to find the first-if-any non-'X'.
> The only way you'll know if you have an X in a particular block is using
> a linear search method, either in Python or with list.index()
Reread markscala's problem statement: all the Xs are up front followed
by 0 or more Os. So, the list is L = ['X']*N + ['O']*M for unknown N>=0
and M>=0. All we have to do is find N and M (if we know either, the
other is immediately computed, since N+M==len(L) and len() is O(1)).
> If (as the OP seemed to state) we KNOW that there's only one block of
> X's in the list:
>
> 1. Find the index of the first X
Why would we do that? He stated, very clearly, and you and I have both
been quoting:
> >> You know
> >> the X's are all up front
So why would we do any work to find out what we altrady know?
> 2. Find the index of the last X.
Yep, the only certain task, and it's O(log(N+M)).
> 3. Delete the block we've found.
And the deletion is definitely linear time (and trivial), of course. I
was focusing on the only interesting part, (2).
> Judging from way the OP worded the question, I'd advise making something
> that works and that you understand how it works.
>
> After that, s/he can worry about whether or not its performance is
> suboptimal.
And indeed, part of what I said (and again you're snipping it rather
than quoting it was:
> > If N is not too huge, O(N) might be OK, and is, of course, way simpler
> > to code!-)
However, even though the O(N) in the deletion subtask would appear to
justify this shortcut, I think the task is way too trivial to justify a
linear-time approach to point 2 -- the obvious N = L.count('X'), of
course. It seems likely that the whole purpose of the exercise
(probably homework) is to have the student identify and develop a
bisection (a notoriously tricky-to-code thing).
Alex
More information about the Python-list
mailing list