Regex on a huge text

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Sun Aug 24 01:46:29 EDT 2008


On Fri, 22 Aug 2008 18:56:51 -0300, John Machin <sjmachin at lexicon.net> wrote:
> On Aug 23, 6:19 am, "Medardo Rodriguez" <med.... at gmail.com> wrote:
>> On Fri, Aug 22, 2008 at 11:24 AM, Dan <redalas... at gmail.com> wrote:
>> > I'm looking on how to apply a regex on a pretty huge input text (a file
>> > that's a couple of gigabytes). I found finditer which would return results
>> > iteratively which is good but it looks like I still need to send a string
>> > which would be bigger than my RAM. Is there a way to apply a regex directly
>> > on a file?
>
> Docs:
> """
> mmap — Memory-mapped file support
>
> Memory-mapped file objects behave like both strings and like file
> objects. Unlike normal string objects, however, these are mutable. You
> can use mmap objects in most places where strings are expected; for
> example, you can use the re module to search through a memory-mapped
> file.
> """

That is still limited by the virtual address space available to a user process: 2 GB or 3 GB depending on the OS (assuming a 32-bit OS). A file of a couple of gigabytes may therefore not fit in a single mapping.
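
For reference, a minimal sketch of the mmap approach described in the quoted docs, written for a current Python 3 (where the pattern has to be a bytes pattern because the mapping exposes bytes). The function name find_offsets and its arguments are made up for illustration; it assumes the whole file can be mapped in one go, which is exactly what may fail on a 32-bit OS.

```python
import mmap
import re


def find_offsets(path, pattern):
    """Return (offset, matched bytes) for every match of `pattern` in the file.

    `pattern` must be a bytes pattern (for example rb"\\d+"), since the
    memory-mapped file is searched as bytes, not as text.
    """
    regex = re.compile(pattern)
    with open(path, "rb") as f:
        # Length 0 maps the whole file. This raises ValueError for an empty
        # file, and fails if the file does not fit in the address space.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # finditer walks the mapping lazily; the OS pages data in on demand,
            # so the file is never read into one big Python string.
            return [(m.start(), m.group()) for m in regex.finditer(mm)]
```

Something like find_offsets("big.log", rb"ERROR \d+") would then give the byte offsets of each hit. When the file is too large for one mapping, mmap.mmap also accepts length and offset arguments (offset must be a multiple of mmap.ALLOCATIONGRANULARITY), so the file can be mapped window by window, but then matches that straddle a window boundary have to be handled separately.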

-- 
Gabriel Genellina



