Most efficient solution?
Bernhard Herzog
bh at intevation.de
Mon Jul 16 10:39:34 EDT 2001
alf at leo.logilab.fr (Alexandre Fayolle) writes:
> On Mon, 16 Jul 2001 09:19:09 -0400, Jay Parlar <jparlar at home.com> wrote:
> >List B consists of my "stopwords", meaning, the words I don't want included in my final version of list A. So what I need to
> >do is check every item in list A, and if it occurs in list B, then I want to remove it from the final version of A. My first thought
> >would be:
> >
> >for eachItem in A:
> >    if eachItem in B:
> >        A.remove(eachItem)
> >
>
> You may get some speedup by making B a dictionary, and using has_key() to
> see if the word is there. This should get you an average O(1) lookup instead of O(n)
> inside the loop. To gain further performance, use filter to skim A.
>
> C = {}
> for item in B:
>     C[item]=None
>
> A = filter(lambda e, dic = C: dic.has_key(e), A)
A bit more elegant, perhaps, and a little faster still would be to use 1
as the value in C and directly use C.get in filter:
C = {}
for item in B:
    C[item] = 1
A = filter(C.get, A)
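
One thing to watch: as written, filter(C.get, A) keeps the items of A that are found in B, which is the opposite of dropping the stopwords. Below is a minimal sketch of the inverted test, assuming a current Python 3 (where has_key() no longer exists and filter() returns an iterator) and using a set rather than a dictionary with dummy values. The sample lists and the names words, stopwords, only_stopwords and cleaned are made up for illustration.

```python
words = ["the", "quick", "brown", "fox", "the", "end"]   # hypothetical list A
stopwords = ["the", "end"]                                # hypothetical list B

# The dictionary approach from the message, spelled for Python 3.
# C.get(w) returns 1 for stopwords and None otherwise, so filter()
# keeps exactly the words that ARE in the stopword list.
C = {item: 1 for item in stopwords}
only_stopwords = list(filter(C.get, words))

# Inverted test: keep the words that are NOT in the stopword list.
# A set gives the same hash-based membership test as the dictionary.
stop = set(stopwords)
cleaned = [w for w in words if w not in stop]

print(only_stopwords)
print(cleaned)
```

Building a new list also sidesteps the problem in the original loop, where calling A.remove() while iterating over A can skip elements.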
--
Intevation GmbH http://intevation.de/
Sketch http://sketch.sourceforge.net/
MapIt! http://mapit.de/