regex matching question
John Machin
sjmachin at lexicon.net
Sat May 19 19:23:28 EDT 2007
On 20/05/2007 3:21 AM, bullockbefriending bard wrote:
> first, regex part:
>
> I am new to regexes and have come up with the following expression:
> ((1[0-4]|[1-9]),(1[0-4]|[1-9])/){5}(1[0-4]|[1-9]),(1[0-4]|[1-9])
>
> to exactly match strings which look like this:
>
> 1,2/3,4/5,6/7,8/9,10/11,12
>
> i.e. 6 comma-delimited pairs of integer numbers separated by the
> backslash character + constraint that numbers must be in range 1-14.
Backslash? Your example uses a [forward] slash.
Are you sure you don't want to allow for some spaces in the data, for
the benefit of the humans, e.g.
1,2 / 3,4 / 5,6 / 7,8 / 9,10 / 11,12
?
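A minimal sketch of what a whitespace-tolerant version could look like, assuming optional spaces are acceptable around the commas and slashes as well as at either end of the input. The names NUM, PAIR, LOOSE and is_valid are made up for illustration; the \Z anchor is the one recommended further down in this message.

```python
import re

NUM = r"(?:1[0-4]|[1-9])"          # an integer from 1 to 14
PAIR = NUM + r"\s*,\s*" + NUM      # two numbers around a comma
# Six pairs separated by slashes, with optional whitespace around separators.
LOOSE = re.compile(r"\s*" + PAIR + r"(?:\s*/\s*" + PAIR + r"){5}\s*\Z")

def is_valid(text):
    """Return True if text is six slash-separated pairs of numbers in 1-14."""
    return LOOSE.match(text) is not None

print(is_valid("1,2 / 3,4 / 5,6 / 7,8 / 9,10 / 11,12"))   # True
print(is_valid("1,2/3,4/5,6/7,8/9,10/11,15"))             # False: 15 is out of range
```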
>
> i should add that i am only interested in finding exact matches (doing
> some kind of command line validation).
>
> this seems to work fine, although i would welcome any advice about how
> to shorten the above. it seems to me that there should exist some
> shorthand for (1[0-4]|[1-9]) once i have defined it once?
>
> also (and this is where my total beginner status brings me here
> looking for help :)) i would like to add one more constraint to the
> above regex. i want to match strings *iff* each pair of numbers are
> different. e.g: 1,1/3,4/5,6/7,8/9,10/11,12 or
> 1,2/3,4/5,6/7,8/9,10/12,12 should fail to be matched by my final
> regex whereas 1,2/3,4/5,6/7,8/9,10/11,12 should match OK.
>
> any tips would be much appreciated - especially regarding preceding
> paragraph!
>
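One possible approach to both requests, offered as a sketch rather than the only answer: the standard re module has, as far as I know, no syntax for re-using a sub-pattern by reference, but the pattern is just a string and can be assembled from a piece that is defined once. The "numbers in a pair must differ" rule is then easy to check in plain Python after the regex has validated the format. NUM, PAIR, SIX_PAIRS and parse_pairs are hypothetical names.

```python
import re

NUM = r"(?:1[0-4]|[1-9])"        # define the 1-14 sub-pattern once
PAIR = NUM + "," + NUM
SIX_PAIRS = re.compile(PAIR + "(?:/" + PAIR + r"){5}\Z")

def parse_pairs(text):
    """Return a list of six (a, b) tuples, or None if text is invalid."""
    if not SIX_PAIRS.match(text):
        return None
    pairs = [tuple(int(n) for n in pair.split(",")) for pair in text.split("/")]
    # Reject the input if the two numbers in any pair are equal.
    if any(a == b for a, b in pairs):
        return None
    return pairs

print(parse_pairs("1,2/3,4/5,6/7,8/9,10/11,12"))
print(parse_pairs("1,1/3,4/5,6/7,8/9,10/11,12"))   # None: first pair repeats 1
```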
> and now for the python part:
>
> results = "1,2/3,4/5,6/7,8/9,10/11,12"
> match = re.match("((1[0-4]|[1-9]),(1[0-4]|[1-9])/){5}(1[0-4]|[1-9]),
> (1[0-4]|[1-9])", results)
Always use "raw" strings for patterns, even if you don't have
backslashes in them -- and this one needs a backslash; see below.
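A small illustration of why the raw-string habit matters, using \b as the example escape (it is not part of the pattern under discussion). In an ordinary string literal Python turns \b into a backspace character before the re module ever sees it; in a raw string the backslash survives.

```python
import re

# In an ordinary string literal "\b" is one character, a backspace (0x08).
plain = "\bcat\b"
# In a raw string literal the backslash is kept, so re sees the \b escape.
raw = r"\bcat\b"

print(len(plain), len(raw))                  # 5 7
print(re.search(plain, "a cat sat"))         # None: looks for backspace characters
print(re.search(raw, "a cat sat").group())   # cat
```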
For clarity, consider using "mobj" or even "m" instead of "match" to
name the result of re.match.
> if match == None or match.group(0) != results:
Instead of
    if mobj == None ...
use
    if mobj is None ...
or
    if not mobj ...
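A short sketch of the difference. "is None" is an identity test, so it cannot be fooled by a class that defines its own __eq__; "not mobj" also works here because match objects are always true. AlwaysEqual and describe are made-up names used only for this demonstration.

```python
import re

class AlwaysEqual:
    """Contrived class whose instances claim to be equal to anything."""
    def __eq__(self, other):
        return True

obj = AlwaysEqual()
print(obj == None)   # True, although obj is not None
print(obj is None)   # False

def describe(text):
    """Test the result of re.match with an identity check."""
    mobj = re.match(r"\d+", text)
    if mobj is None:
        return "no match"
    return "matched " + mobj.group(0)

print(describe("42 apples"))   # matched 42
print(describe("apples"))      # no match
```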
Instead of the "or match.group(0) != results" caper, put \Z (*not* $,
which also matches just before a trailing newline) at the end of your
pattern:
    mobj = re.match(r"pattern\Z", results)
    if not mobj:
        ...
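To see the difference between the two anchors, here is a sketch that uses a single pair instead of the full six-pair pattern, purely for brevity. PATTERN, good and sneaky are illustrative names.

```python
import re

PATTERN = r"(?:1[0-4]|[1-9]),(?:1[0-4]|[1-9])"   # a single pair, for brevity

good = "11,12"
sneaky = "11,12\n"        # same text with a trailing newline

print(bool(re.match(PATTERN + r"$", sneaky)))    # True: $ also matches before a final newline
print(bool(re.match(PATTERN + r"\Z", sneaky)))   # False: \Z matches only at the very end
print(bool(re.match(PATTERN + r"\Z", good)))     # True
```

On Python 3.4 and later, re.fullmatch(PATTERN, text) should give the same strict behaviour without an explicit anchor.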
HTH,
John