Using regular expressions to extract substrings from files

Jason Lai jmlai at uci.edu
Thu Sep 9 22:42:25 EDT 2004


Timothy Hume wrote:
> Hi,
> 
> I am new to Python, and was wondering if it is possible to operate on 
> files using regular expressions.
> 
> What I mean is this:
> - It is easy to search for a substring of a string using regular 
>   expressions
> - Can I also search for a substring inside a file using regular 
>   expressions? The substring may span several lines (i.e. there may be 
>   embedded newline and carriage return characters).
> 
> So far, the only way I know how to do this is to read the entire file into 
> a string, and then parse the resulting string with regular expressions. 
> This is OK for small files (in fact it is probably quite efficient, 
> because the disc I/O is done all at once). However, once the files get 
> large, there is the risk I will run out of memory. The closest UNIX tool I 
> can think of to do this sort of job is grep, but that doesn't have the 
> power and flexibility of Python.
> 
> Any ideas would be appreciated.
> 
> Tim Hume
> Bureau of Meteorology Research Centre
> Melbourne
> Australia
> 

http://docs.python.org/lib/module-mmap.html
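
The mmap module linked above maps a file into memory so that the operating
system pages it in on demand, and the re module can search the mapped object
directly instead of a string read in one go. Below is a minimal sketch of
that idea, assuming a current Python 3, where the pattern has to be a bytes
pattern. The function name find_in_file and the BEGIN/END markers are made up
for illustration, not taken from the original question.

```python
import mmap
import re


def find_in_file(path, pattern):
    """Return every match of a bytes regex `pattern` in the file at `path`.

    Hypothetical helper for illustration only.
    """
    # re.DOTALL lets "." match newline characters, so one match can span lines.
    regex = re.compile(pattern, re.DOTALL)
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ gives a read-only mapping.
        # Note: mapping an empty file raises ValueError.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Each match is copied out as a bytes object before the map closes.
            return [m.group(0) for m in regex.finditer(mm)]


if __name__ == "__main__":
    import sys

    # Usage: python script.py FILE  (prints each BEGIN...END block found)
    for block in find_in_file(sys.argv[1], rb"BEGIN.*?END"):
        print(block)
```

The matches come back as bytes, so decode them if text is needed. A mapping
still needs address space for the whole file, so on a 32-bit system very
large files may not map in one piece.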


