Regex on a huge text

Medardo Rodriguez med.swl at gmail.com
Fri Aug 22 16:19:22 EDT 2008


On Fri, Aug 22, 2008 at 11:24 AM, Dan <redalastor at gmail.com> wrote:
> I'm looking on how to apply a regex on a pretty huge input text (a file
> that's a couple of gigabytes). I found finditer which would return results
> iteratively which is good but it looks like I still need to send a string
> which would be bigger than my RAM. Is there a way to apply a regex directly
> on a file?
>
> Any help would be appreciated.


You can call the POSIX *grep* utility.
But if each match can only occur within a single line of the file,
you can read the file one line at a time and match each line:
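#<code>
import re

res = []
with open(filename) as f:  # iterating a file object reads one line at a time
    for line in f:
        res.extend(re.findall(regex, line))
#  Of course "re.findall" only illustrates the concept; use whatever
#  per-line matching you need.
#</code>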
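
If the list of results itself could get large, the same line-by-line idea can be written as a generator, so that neither the file nor the matches are held in memory all at once. This is only a minimal sketch: the name "iter_matches" and its parameters are made up for illustration, it assumes the file is text in a known encoding, and it still cannot find matches that span more than one line.

```python
import re


def iter_matches(pattern, path, encoding="utf-8"):
    """Yield (line_number, matched_text) pairs, reading one line at a time.

    "iter_matches" is a hypothetical helper, not part of the re module.
    """
    regex = re.compile(pattern)  # compile once, reuse for every line
    # errors="replace" keeps going if a few bytes are not valid in the encoding
    with open(path, encoding=encoding, errors="replace") as f:
        for lineno, line in enumerate(f, start=1):
            for match in regex.finditer(line):
                yield lineno, match.group(0)
```

You would then loop over it, e.g. "for lineno, text in iter_matches(regex, filename): ...", and handle each match as it arrives.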

Regards


