Unable to make regular expression match on multiline files

Fredrik Lundh fredrik at pythonware.com
Thu Nov 15 18:15:25 EST 2001


David Lees wrote:
> I am trying to use a simple regular expression to extract some digits
> which are tagged with ascii text.  Everything works fine on a single
> line, but when I use text that has the '\n' character it fails.  Here is
> a sample.
>
> >>> p=re.compile('.*Number States: (\d+)',re.MULTILINE)

you want DOTALL, not MULTILINE (MULTILINE only changes where ^ and $
match; DOTALL is what lets "." match a newline):

>>> import re
>>> p = re.compile(r".*Number States: (\d+)", re.DOTALL)
>>> a = " End: -1STATUS\nNumber States: 6\njunk"
>>> p.match(a)
<SRE_Match object at 008D6798>
>>> _.groups()
('6',)
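
The difference between the two flags can be checked side by side. The
following is only a minimal sketch using the sample string from this
message; the names text, pattern, multiline_match and dotall_match are
made up for illustration.

```python
import re

text = " End: -1STATUS\nNumber States: 6\njunk"
pattern = r".*Number States: (\d+)"

# MULTILINE only affects ^ and $; "." still stops at "\n",
# so match() cannot get past the first line of the text.
multiline_match = re.compile(pattern, re.MULTILINE).match(text)

# DOTALL lets "." match "\n" as well, so ".*" can swallow the
# first line and the pattern matches on the second one.
dotall_match = re.compile(pattern, re.DOTALL).match(text)

print(multiline_match)         # None
print(dotall_match.group(1))   # 6
```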

> >>> a=' End: -1STATS:\nNumber States: 6\njunk'
> >>> m=p.match(a)
> >>> m.groups()

or better, drop the leading ".*" and use search instead of match
(match only tries at the start of the string, search scans forward):

>>> p = re.compile(r"Number States: (\d+)")
>>> p.search(a)
<SRE_Match object at 008D6AF8>
>>> _.groups()
('6',)
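
As a rough illustration of why search is the better fit here, the sketch
below runs both methods with the same flag-free pattern on the sample
string. The names text, number_re, at_start and anywhere are
hypothetical, chosen just for this example.

```python
import re

text = " End: -1STATUS\nNumber States: 6\njunk"
number_re = re.compile(r"Number States: (\d+)")

# match() only tries at position 0, where the text starts with " End:".
at_start = number_re.match(text)

# search() scans forward until the pattern fits, newlines or not.
anywhere = number_re.search(text)

print(at_start)            # None
print(anywhere.group(1))   # 6
```

Since no "." is involved any more, neither DOTALL nor MULTILINE is needed.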

</F>

<!-- (the eff-bot guide to) the python standard library:
http://www.pythonware.com/people/fredrik/librarybook.htm
-->




