88k regex = RuntimeError

jodawi jodawi.spamtrap at gmail.com
Tue Feb 14 02:07:53 EST 2006


I need to find a bunch of C function declarations by searching
thousands of source or HTML files for thousands of known function
names. My initial, simple approach was this:

rxAllSupported = re.compile(r"\b(" + "|".join(gAllSupported) + r")\b")
# giving a regex of   \b(AAFoo|ABFoo|   (uh... 88kb more...)   |zFoo)\b

for root, dirs, files in os.walk( ... ):
...
    for fileName in files:
...
        filePath = os.path.join(root, fileName)
        file = open(filePath, "r")
        contents = file.read()
...
        result = re.search(rxAllSupported, contents)

but this happens:

    result = re.search(rxAllSupported, contents)
  File "C:\Python24\Lib\sre.py", line 134, in search
    return _compile(pattern, flags).search(string)
RuntimeError: internal error in regular expression engine

I assume it's hitting some internal limit, but I don't know where that
limit is set or how to raise it. I tried stepping into the search
repeatedly with Komodo, but didn't see where it fails.
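
For what it's worth, one way to avoid the single giant alternation
might be to match every identifier with a small pattern and check each
one against a set of the known names. A minimal sketch, assuming
gAllSupported is a list of plain C identifiers (rxIdentifier and
find_supported are hypothetical names, not part of my script):

```python
import re

# Matches any C-style identifier. The pattern stays tiny no matter
# how many names are in the supported list.
rxIdentifier = re.compile(r"\b[A-Za-z_]\w*")

def find_supported(contents, supported):
    """Return the set of names from `supported` that occur in `contents`."""
    supported = set(supported)  # set membership tests are fast
    found = set()
    for match in rxIdentifier.finditer(contents):
        name = match.group(0)
        if name in supported:
            found.add(name)
    return found
```

In the loop above, this would stand in for the re.search call, e.g.
found = find_supported(contents, gAllSupported). It only reports which
names appear, not where, so it is not a drop-in replacement if the
match positions matter.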

Suggestions?



