Can one make 'in' ungreedy?

Chris Green cl at isbd.net
Mon May 18 09:43:48 EDT 2020


Larry Martell <larry.martell at gmail.com> wrote:
> On Mon, May 18, 2020 at 7:05 AM Chris Green <cl at isbd.net> wrote:
> >
> > I have a strange/minor problem in a Python program I use for mail
> > filtering.
> >
> > One of the ways it classifies messages is by searching for a specific
> > string in square brackets [] in the Subject: header.  The section of
> > code that does this is:-
> >
> >     #
> >     #
> >     # copy the fields from the filter configuration file into better named variables
> >     #
> >     nm = fld[0]             # name/alias
> >     dd = fld[1] + "/"       # destination directory
> >     tocc = fld[2].lower()   # list address
> >     sbstrip = '[' + fld[3] + ']'        # string to match in and/or strip out of subject
> >     #
> >     #
> >     # see if the filter To/CC column matches the message To: or Cc: or if sbstrip is in Subject:
> >     #
> >     if (tocc in msgcc or tocc in msgto or sbstrip in msgsb):
> >         #
> >         #
> >         # set the destination directory
> >         #
> >         dest = mldir + dd + nm
> >         #
> >         #
> >         # Strip out list name (4th field) from subject if it's there
> >         #
> >         if sbstrip in msgsb:
> >             msg.replace_header("Subject", msgsb.replace(sbstrip, ''))
> >         #
> >         #
> >         # we've found a match so assume we won't get another
> >         #
> >         break
> >
> >
> > So in the particular case where I have a problem sbstrip is "[Ipswich
> > Recycle]" and the Subject: is "[SPAM] [Ipswich Recycle] OFFER:
> > Lawnmower (IP11)".  The match isn't found, presumably because 'in' is
> > greedy and sees "[SPAM] [Ipswich Recycle]" which isn't a match for
> > "[Ipswich Recycle]".
> >
> > Other messages with "[Ipswich Recycle]" in the Subject: are being
> > found and filtered correctly; it seems that it's the presence of the
> > "[SPAM]" in the Subject: that's breaking things.
> >
> > Is this how 'in' should work?  It seems a little strange if so, not
> > intuitively how one would expect 'in' to work.  ... and is there any
> > way round the issue except by recoding a separate test for the
> > particular string search where this can happen?
> 
> >>> sbstrip = "[Ipswich Recycle]"
> >>> subject = "[SPAM] [Ipswich Recycle] OFFER:Lawnmower (IP11)"
> >>> sbstrip in subject
> True
> 
> Clearly something else is going on in your program. I would run it in
> the debugger and look at the values of the variables in the case when
> it fails when you think it should succeed. I think you will see the
> variables do not hold what you think they do.

Absolutely right!  It wasn't even this program that had the problem;
the odd messages in the wrong place were arriving via my 'catchall'
mail filter, which deposited them into my inbox by an entirely
different route!  Typical - looking for the bug in the wrong program. :-)
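
For the record, a minimal sketch of why 'in' can't be "greedy": on
strings it is a plain literal substring test, not a regular expression,
so the brackets have no special meaning and whatever comes before the
match makes no difference.  The first two values below are the ones from
this thread; the 'folded' subject is a made-up example (not something
seen in my mail) of the kind of hidden difference that printing repr()
of a variable would reveal.

```python
# Values taken from the thread.
sbstrip = "[Ipswich Recycle]"
subject = "[SPAM] [Ipswich Recycle] OFFER: Lawnmower (IP11)"

# 'in' on str is a literal substring search; '[' and ']' are ordinary characters.
found = sbstrip in subject
position = subject.find(sbstrip)          # index where the match starts
stripped = subject.replace(sbstrip, "")   # same call the filter uses

print(found, position, repr(stripped))

# Hypothetical case: a Subject: whose whitespace differs from what you expect
# (here a newline plus space in the middle of the list tag).
folded = "[SPAM] [Ipswich\n Recycle] OFFER: Lawnmower (IP11)"
print(repr(folded))                       # repr() makes the '\n' visible
found_folded = sbstrip in folded

# Collapsing runs of whitespace to single spaces makes the test succeed again.
normalised = " ".join(folded.split())
found_normalised = sbstrip in normalised
print(found_folded, found_normalised)
```

So if 'in' ever appears to miss an obvious match, the variables are
the thing to inspect, exactly as Larry suggested.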

-- 
Chris Green


More information about the Python-list mailing list