88k regex = RuntimeError

Tim N. van der Leeuw tim.leeuwvander at nl.unisys.com
Tue Feb 14 09:37:46 EST 2006


This is basically the same idea I tried to describe in my previous
post, which didn't include any samples.
I wonder whether it's more efficient to create a new list using a
list comprehension, checking each entry against the 'wanted' set,
or to create a new set which is the intersection of the 'wanted' set
and the iterable of all matches...

Your sample code would then look like this:

>>> import re
>>> r = re.compile(r"\w+")
>>> file_content = "foo bar-baz ignored foo()"
>>> wanted = set(["foo", "bar", "baz"])
>>> found = wanted.intersection(r.findall(file_content))
>>> print found
set(['baz', 'foo', 'bar'])
>>>
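
For comparison, the list-comprehension variant mentioned above could be sketched roughly as follows. This is only an illustration, written as a plain script that should also run on current Python 3; the names found_list and found_set are made up here and are not from the original sample.

```python
import re

r = re.compile(r"\w+")
file_content = "foo bar-baz ignored foo()"
wanted = set(["foo", "bar", "baz"])

# List comprehension: keeps the order of the matches and any duplicates.
found_list = [name for name in r.findall(file_content) if name in wanted]

# Set intersection: unordered, each wanted name appears at most once.
found_set = wanted.intersection(r.findall(file_content))

print(found_list)
print(sorted(found_set))
```

Note that the two results are not interchangeable: the list still contains "foo" twice, while the set only records that each name occurred.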

Does anyone have an idea which is faster? (This dataset is so small that
it doesn't make sense to run any performance tests on it.)
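
One way to find out would be to time both variants on a larger input with the standard timeit module. The sketch below assumes a reasonably current Python; the synthetic data (10,000 wanted names, 50,000 tokens of text) and the two function names are invented purely for illustration and say nothing about the real dataset.

```python
import re
import timeit

r = re.compile(r"\w+")

# Hypothetical synthetic data, not the real dataset:
# 10,000 wanted names and a text of 50,000 tokens, some of them unwanted.
wanted = set("name%d" % i for i in range(10000))
file_content = " ".join("name%d" % (i % 20000) for i in range(50000))

def with_list_comprehension():
    # New list: test every match for membership in the wanted set.
    return [name for name in r.findall(file_content) if name in wanted]

def with_intersection():
    # New set: intersect the wanted set with the iterable of all matches.
    return wanted.intersection(r.findall(file_content))

for func in (with_list_comprehension, with_intersection):
    seconds = timeit.timeit(func, number=10)
    print("%-26s %.3f s for 10 runs" % (func.__name__, seconds))
```

Both variants call findall on the whole text, so it would not be surprising if the regex scan dominated the timings and the difference between the two turned out to be small. Only a measurement on the real data can settle it.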

Cheers,

--Tim



