Multiline regex help

Yatima yatima_ at konishi.polis.net
Thu Mar 3 16:37:20 EST 2005


On Thu, 03 Mar 2005 13:45:31 -0700, Steven Bethard <steven.bethard at gmail.com> wrote:
>
> I think if you use the non-greedy .*? instead of the greedy .*, you'll 
> get this behavior.  For example:
>
> py> s = """\
> ... Gibberish
> ... 53
> ... MoreGarbage
> [snip a whole bunch of stuff]
> ... RelevantInfo3
> ... 60
> ... Lalala
> ... """
> py> import re
> py> m = re.compile(r"""^RelevantInfo1\n([^\n]*)
> ...                    .*?
> ...                    ^RelevantInfo2\n([^\n]*)
> ...                    .*?
> ...                    ^RelevantInfo3\n([^\n]*)""",
> ...                re.DOTALL | re.MULTILINE | re.VERBOSE)
> py> score = {}
> py> for info1, info2, info3 in m.findall(s):
> ...     score.setdefault(info1, {})[info3] = info2
> ...
> py> score
> {'10/10/04': {'44': '33', '23': '22'}, '10/11/04': {'60': '45'}}
>
> If you might have multiple info2 values for the same (info1, info3) 
> pair, you can try something like:
>
> py> score = {}
> py> for info1, info2, info3 in m.findall(s):
> ...     score.setdefault(info1, {}).setdefault(info3, []).append(info2)
> ...
> py> score
> {'10/10/04': {'44': ['33'], '23': ['22']}, '10/11/04': {'60': ['45']}}
>
Perfect! Thank you so much. This is exactly the behaviour I was looking
for. I will fiddle around with it some more tonight, but the rest should
be okay.
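
For anyone reading this in the archive later, here is a small sketch of
why the non-greedy .*? matters. The sample text is made up (the real data
was snipped above), and build() is just a hypothetical helper for
swapping the gap pattern, so treat it as an illustration rather than the
exact code from this thread:

```python
import re

# Made-up sample data (an assumption, not the data from the thread):
# two records, each with three labelled values and some noise lines.
sample = """\
RelevantInfo1
10/10/04
noise
RelevantInfo2
33
noise
RelevantInfo3
44
RelevantInfo1
10/11/04
noise
RelevantInfo2
45
noise
RelevantInfo3
60
"""


def build(gap):
    # Hypothetical helper: gap is ".*" (greedy) or ".*?" (non-greedy).
    # re.DOTALL lets "." cross newlines; re.MULTILINE lets "^" match
    # at the start of every line.
    return re.compile(
        r"^RelevantInfo1\n([^\n]*)" + gap +
        r"^RelevantInfo2\n([^\n]*)" + gap +
        r"^RelevantInfo3\n([^\n]*)",
        re.DOTALL | re.MULTILINE,
    )


# Greedy gaps run to the last possible label, so the two records
# collapse into a single match that mixes values from both.
greedy = build(".*").findall(sample)

# Non-greedy gaps stop at the first possible label, giving one
# match per record.
lazy = build(".*?").findall(sample)

# Group the per-record matches the same way as in the quoted reply.
score = {}
for info1, info2, info3 in lazy:
    score.setdefault(info1, {})[info3] = info2

print(greedy)
print(lazy)
print(score)
```

On this sample the greedy version finds only one match, pairing the
first date with the last two values, while the non-greedy version finds
one match per record.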

Take care.

-- 
Of course power tools and alcohol don't mix.  Everyone knows power
tools aren't soluble in alcohol...
		-- Crazy Nigel


