using re: hitting recursion limit
Helmut Jarausch
jarausch at skynet.be
Wed Oct 27 10:15:51 EDT 2004
Erik Johnson wrote:
> I have done a fair amount of regular expression text processing in Perl,
> and am currently trying to convert a running Perl script into Python (for a
> number of reasons I won't go into here). I have not had any problems with
> memory limits using Perl, but in trying to clip out a particular table from
> a web page, I am hitting Python's recursion limit.
>
> The RE is pretty simple:
>
> pat = '(<table.*?%s.*?</table>)' % magic_string
>
> This seems about as simple as a "real-world" RE can get. If I cut down
> the web page to about 150 lines, this works, but that's not practical - the
> table I need to parse can easily grow to over 1000 lines. I found the
> following bit in the reference manual from section 4.2.6:
You can do without REs. There is a version of mxTextTools (even a
non-recursive one); see http://simpleparse.sourceforge.net/
It is really fast.
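
For a single table, plain string searching may already be enough. The
following is only a minimal sketch, not the mxTextTools approach: the
function name clip_table is made up for illustration, and it assumes the
tags are lowercase, the tables are not nested, and the magic string
really sits inside a table. It locates the magic string first, then
looks backwards for the nearest "<table" and forwards for the next
"</table>", so no recursion is involved no matter how long the page is.

```python
def clip_table(html, magic_string):
    """Return the <table>...</table> block around magic_string, or None.

    Hypothetical helper: assumes lowercase tags and no nested tables.
    """
    # Position of the first occurrence of the magic string.
    hit = html.find(magic_string)
    if hit == -1:
        return None

    # Nearest opening tag before the magic string.
    start = html.rfind("<table", 0, hit)
    if start == -1:
        return None

    # First closing tag after the magic string.
    end_tag = "</table>"
    end = html.find(end_tag, hit + len(magic_string))
    if end == -1:
        return None

    return html[start:end + len(end_tag)]


if __name__ == "__main__":
    # Build a page whose table is well over 1000 lines long.
    rows = "\n".join("<tr><td>row %d</td></tr>" % i for i in range(2000))
    page = "<html><table><tr><td>other</td></tr></table>\n" \
           "<table>\n<tr><td>MAGIC</td></tr>\n" + rows + "\n</table></html>"
    table = clip_table(page, "MAGIC")
    print(len(table.splitlines()))
```

Unlike the pattern in the original post, this picks the opening tag
closest to the magic string rather than the first one on the page, which
is probably what is wanted when the page holds several tables.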
--
Helmut Jarausch
Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany