Python too slow for real world

Fredrik Lundh fredrik at pythonware.com
Mon May 3 14:43:04 EDT 1999


Nathan Clegg <nathan at islanddata.com> wrote:
> I am interested in running a lot of text through dozens of different
> regexen (precompiled, of course).  However, I am interested in only
> whether or not each chunk of text passed which regexen.  I don't care
> about what matched and retrieving it--just a truth value.  Is there a way
> to get this without the overhead of the features I am not using?

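here's one way to do it: wrap each pattern in its own group, join
the groups with "|" into a single regular expression, and check
which group took part in the match: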

import re, string

patterns = [
    r"\d+",
    r"abc\d{2,4}",
    r"p\w+"
]

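def combined_pattern(patterns):
    # wrap each pattern in its own group, and join the groups with "|"
    p = re.compile(
        string.join(map(lambda x: "("+x+")", patterns), "|")
        )
    # m is the low-level matcher behind the compiled pattern (an
    # internal detail of the re module).  m and r are bound as
    # default arguments so each call avoids the extra lookups.
    def fixup(v, m=p.code.match, r=range(1,len(patterns)+1)):
        regs = m(v)
        try:
            for i in r:
                # (-1, -1) means that group i didn't take part in the match
                if regs[i] != (-1, -1):
                    return i-1
        except:
            # regs can't be indexed if nothing matched
            return None # no match
    return fixup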

p = combined_pattern(patterns)

# p returns the index of the matching
# pattern, or None

print p("129391")
print p("abc800")
print p("abc1600")
print p("python")
print p("perl")
print p("tcl")
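
for readers on Python 3: the code above relies on string.join, the
print statement, and the internal p.code attribute, none of which
exist there.  what follows is a rough sketch of the same idea using
only the documented re API.  it is not the code from this post: the
names first_match and the p0, p1, ... group names are made up for the
example, and using search (rather than match, which only matches at
the start of the string) is an assumption about the intended
behaviour.

```python
import re

patterns = [
    r"\d+",
    r"abc\d{2,4}",
    r"p\w+",
]

def combined_pattern(patterns):
    # Wrap each pattern in a named group (p0, p1, ...) and join with "|".
    # The group names are hypothetical, chosen only for this sketch.
    combined = re.compile(
        "|".join("(?P<p%d>%s)" % (i, pat) for i, pat in enumerate(patterns))
    )

    # Bind the search method once, as the original binds its matcher.
    def first_match(text, search=combined.search):
        m = search(text)
        if m is None:
            return None  # no pattern matched
        # lastgroup is the name of the alternative that matched, e.g. "p2"
        return int(m.lastgroup[1:])

    return first_match

p = combined_pattern(patterns)

# p returns the index of the matching pattern, or None
for text in ("129391", "abc800", "abc1600", "python", "perl", "tcl"):
    print(text, p(text))
```

as with the original, each call reports a single index even if more
than one pattern would match the text.  to get a separate truth
value for every pattern, each one still has to be tried on its own.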

</F>




