re.search (works)|(doesn't work) depending on for loop order

John Machin sjmachin at lexicon.net
Sat Mar 22 18:58:09 EDT 2008


On Mar 23, 7:27 am, sgharvey <KephnosAnagen... at gmail.com> wrote:
> ... and by works, I mean works like I expect it to.

You haven't told us what you expect it to do. In any case, your
subject heading indicates that the problem is 99.999% likely to be in
your logic -- the converse would require the result of re.compile() to
retain some memory of what it's seen before *AND* to act differently
depending somehow on those memorised facts.

>
> I'm writing my own cheesy config.ini parser because ConfigParser
> doesn't preserve case or order of sections, or order of options w/in
> sections.
>
> What's confusing me is this:
>    If I try matching every line to one pattern at a time, all the
> patterns that are supposed to match, actually match.
>    If I try to match every pattern to one line at a time, only one
> pattern will match.
>
> What am I not understanding about re.search?

Its behaviour is not contingent on previous input.
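
A quick way to convince yourself of this is to run both loop orders over the same data and compare the hits. This is a minimal sketch: the two patterns and the sample lines are made up for illustration and are not taken from the pastebin.

```python
import re

# Hypothetical patterns and sample lines, for illustration only.
compiled_regexes = {
    'section': re.compile(r'\[(.+)\]'),
    'setting': re.compile(r'(\w+)\s*=\s*(.*)'),
}
lines = ['[general]', 'name = value', '# a comment']

# Order 1: for each line, try every pattern.
by_line = set()
for lineno, line in enumerate(lines):
    for pattern_name, regex in compiled_regexes.items():
        if regex.search(line):
            by_line.add((lineno, pattern_name))

# Order 2: for each pattern, try every line.
by_pattern = set()
for pattern_name, regex in compiled_regexes.items():
    for lineno, line in enumerate(lines):
        if regex.search(line):
            by_pattern.add((lineno, pattern_name))

# A compiled regex keeps no memory of earlier input, so both
# loop orders find exactly the same (line, pattern) pairs.
assert by_line == by_pattern
```

If the two sets ever differ in your real code, the difference comes from the surrounding logic (a reused variable, an early break, a consumed iterator), not from re.search.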

The following pseudocode is not very useful; I could make the
corrections below only after reading the actual pastebin code :-( ...
you are using the name "pattern" to refer both to a pattern name
(e.g. 'setting') and to a compiled regex.

> Doesn't match properly:
> <code>
>       # Iterate through each pattern for each line
>       for line in lines:
>          for pattern in patterns:

you mean: for pattern_name in pattern_names:

>             # Match each pattern to the current line
>             match = patterns[pattern].search(line)

you mean: match = compiled_regexes[pattern_name].search(line)

>             if match:
>                "%s: %s" % (pattern, str(match.groups()) )

you mean: print pattern_name, match.groups()

> </code>
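
Putting those corrections together, the loop might look something like the sketch below. The names pattern_names and compiled_regexes, the two patterns, and the sample lines are stand-ins of my own, not the pastebin's; print is written with parentheses so the snippet also runs on Python 3.

```python
import re

# Hypothetical stand-ins for the pastebin's names and data.
pattern_names = ['section', 'setting']  # a list, so the order is fixed
compiled_regexes = {
    'section': re.compile(r'\[(.+)\]'),
    'setting': re.compile(r'(\w+)\s*=\s*(.*)'),
}
lines = ['[general]', 'name = value']

results = []
for line in lines:
    for pattern_name in pattern_names:
        # Look up the compiled regex by its name, then search the line.
        match = compiled_regexes[pattern_name].search(line)
        if match:
            # groups is a method and must be called to get the tuple.
            results.append("%s: %s" % (pattern_name, match.groups()))

for result in results:
    print(result)
```

Keeping the names and the compiled regexes in separately named objects makes it much harder to confuse one for the other.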
>
> _Does_ match properly:
> <code>
[snip]

> </code>
>
> Related code:
> The whole src http://pastebin.com/f63298772

This can't be the code that you ran, because it won't even compile.
See comments in my update at http://pastebin.com/m77f0617a

By the way, you should be either (a) using *match* (not search) with a
\Z at the end of each pattern or (b) checking that there is not
extraneous guff at the end of the line ... otherwise a line like
"[blah] waffle" would be classified as a "section".
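
To illustrate option (a), here is a small sketch comparing an unanchored search with match plus \Z. The section pattern is a guess at what yours might look like, not a copy of it.

```python
import re

# Hypothetical section patterns; the real ones are in the pastebin.
loose = re.compile(r'\[([^\]]+)\]')       # unanchored, used with search
strict = re.compile(r'\[([^\]]+)\]\Z')    # must end at end of string

line = '[blah] waffle'

# search finds the bracketed part and ignores the trailing guff.
loose_hit = loose.search(line)

# match anchors at the start and \Z anchors at the end, so the
# trailing " waffle" makes the whole line fail.
strict_hit = strict.match(line)

# A clean section line still matches.
clean_hit = strict.match('[blah]')
```

Here loose_hit is a match object whose group is 'blah', strict_hit is None, and clean_hit is a match object.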

Have you considered leading/trailing/embedded spaces?
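
One possible way to deal with them, assuming you want to tolerate spaces around the option name and the value but not inside the name, is sketched below. The pattern is illustrative only; whether embedded spaces should be accepted is a decision for your file format.

```python
import re

# Hypothetical setting pattern: optional whitespace before the name,
# around the '=', and after the value; \Z rejects anything left over.
setting = re.compile(r'\s*(\w+)\s*=\s*(.*?)\s*\Z')

# Leading and trailing spaces (and a trailing newline) are dropped;
# the space embedded in the value is kept.
padded = setting.match('  name =  some value  \n')

# A space embedded in the option name does not match at all.
broken = setting.match('na me = x')
```

With this pattern padded.groups() is ('name', 'some value') and broken is None.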

> regexen and delimiters (imported into whole src) http://pastebin.com/f485ac180

HTH,
John


