Regex on a huge text

John Machin sjmachin at lexicon.net
Fri Aug 22 17:56:51 EDT 2008


On Aug 23, 6:19 am, "Medardo Rodriguez" <med.... at gmail.com> wrote:
> On Fri, Aug 22, 2008 at 11:24 AM, Dan <redalas... at gmail.com> wrote:
> > I'm looking for a way to apply a regex to a pretty huge input text (a file
> > that's a couple of gigabytes). I found finditer which would return results
> > iteratively which is good but it looks like I still need to send a string
> > which would be bigger than my RAM. Is there a way to apply a regex directly
> > on a file?
>
> > Any help would be appreciated.
>
> You can call the POSIX *grep* utility.
> But if the regex's matches are possible only within the context of a
> line of that file:
> #<code>
(snip)
> #</code>

Docs:
"""
mmap — Memory-mapped file support

Memory-mapped file objects behave like both strings and like file
objects. Unlike normal string objects, however, these are mutable. You
can use mmap objects in most places where strings are expected; for
example, you can use the re module to search through a memory-mapped
file.
"""
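
A minimal sketch of what the docs describe, assuming Python 3 (where the
pattern has to be a bytes pattern, since an mmap object exposes bytes rather
than text). The function name find_matches is made up for illustration; only
mmap.mmap and re.finditer come from the standard library. The operating
system pages the file in on demand, so the whole file does not have to fit
in RAM, although on a 32-bit build the address space may still be too small
to map a multi-gigabyte file.

```python
import mmap
import re


def find_matches(path, pattern):
    """Return (offset, matched bytes) for every match of pattern in the file.

    pattern must be a bytes pattern (e.g. rb"\\d+"), because the mapped
    file is searched as bytes, not as text.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
        # Note: mapping an empty file raises ValueError.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # re.finditer scans the mapping lazily, one match at a time.
            return [(m.start(), m.group()) for m in re.finditer(pattern, mm)]
```

If the number of matches could itself be huge, it is probably better to
process each match inside the loop instead of collecting them in a list.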



