using re: hitting recursion limit

Helmut Jarausch jarausch at skynet.be
Wed Oct 27 10:15:51 EDT 2004


Erik Johnson wrote:
>     I have done a fair amount of regular expression text processing in Perl,
> and am currently trying to convert a running Perl script into Python (for a
> number of reasons I won't go into here). I have not had any problems with
> memory limits using Perl, but in trying to clip out a particular table from
> a web page, I am hitting Python's recursion limit.
> 
> The RE is pretty simple:
> 
> pat = '(<table.*?%s.*?</table>)' % magic_string
> 
>     This seems about as simple as a "real-world" RE can get. If I cut down
> the web page to about 150 lines, this works, but that's not practical - the
> table I need to parse can easily grow to over 1000 lines.  I found the
> following bit in the reference manual from section 4.2.6:

You can do this without REs. There is a version of mxTextTools (a
non-recursive one, even), see http://simpleparse.sourceforge.net/
and it is really fast.
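
For a single table, plain string methods may already be enough. Below is a
minimal sketch, not the mxTextTools approach: the function name extract_table
is made up for illustration, and it assumes lowercase tags, no nested tables,
and that the first occurrence of the magic string is the one you want. Note
that it starts at the nearest "<table" before the magic string, whereas the
RE above starts at the first "<table" on the page.

```python
def extract_table(page, magic_string):
    """Return the <table>...</table> block around magic_string, or None.

    Hypothetical helper: assumes lowercase tags and no nested tables.
    """
    end_tag = "</table>"

    # Locate the first occurrence of the marker text.
    hit = page.find(magic_string)
    if hit == -1:
        return None

    # Walk back to the nearest table opening before the marker.
    start = page.rfind("<table", 0, hit)
    if start == -1:
        return None

    # If that table was already closed before the marker, the marker
    # is not inside a table at all.
    if page.find(end_tag, start, hit) != -1:
        return None

    # Walk forward to the first table close after the marker.
    end = page.find(end_tag, hit + len(magic_string))
    if end == -1:
        return None

    return page[start:end + len(end_tag)]
```

Since str.find and str.rfind are simple linear scans with no backtracking,
the size of the table should not matter for the recursion limit.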

-- 
Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany


