regex matching question

John Machin sjmachin at lexicon.net
Sat May 19 19:23:28 EDT 2007


On 20/05/2007 3:21 AM, bullockbefriending bard wrote:
> first, regex part:
> 
> I am new to regexes and have come up with the following expression:
>      ((1[0-4]|[1-9]),(1[0-4]|[1-9])/){5}(1[0-4]|[1-9]),(1[0-4]|[1-9])
> 
> to exactly match strings which look like this:
> 
>      1,2/3,4/5,6/7,8/9,10/11,12
> 
> i.e. 6 comma-delimited pairs of integer numbers separated by the
> backslash character + constraint that numbers must be in range 1-14.

Backslash? Your example uses a [forward] slash.

Are you sure you don't want to allow for some spaces in the data, for 
the benefit of the humans, e.g.
    1,2 / 3,4 / 5,6 / 7,8 / 9,10 / 11,12
?
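
If you do want to tolerate spaces, here is a minimal sketch of one way 
it might look. The names NUM, PAIR, SPACED and is_valid are made up for 
illustration, and it assumes that only plain spaces around the slash 
should be accepted. Building the pattern from a named piece also avoids 
typing the 1-14 alternation over and over.

```python
import re

# Hypothetical helper names, not part of the original code.
NUM = r"(?:1[0-4]|[1-9])"      # one integer in the range 1-14
PAIR = NUM + r"," + NUM        # two such integers separated by a comma

# Six pairs separated by "/", with optional spaces around each slash.
# \Z anchors the match at the very end of the string.
SPACED = re.compile(PAIR + r"(?: */ *" + PAIR + r"){5}\Z")

def is_valid(text):
    """Return True if text is six comma-delimited pairs of 1-14 integers."""
    return SPACED.match(text) is not None
```

With this sketch, both "1,2/3,4/5,6/7,8/9,10/11,12" and the spaced-out 
form above should be accepted, while a number outside 1-14 is rejected.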

> 
> i should add that i am only interested in finding exact matches (doing
> some kind of command line validation).
> 
> this seems to work fine, although i would welcome any advice about how
> to shorten the above. it seems to me that there should exist some
> shorthand for (1[0-4]|[1-9]) once i have defined it once?
> 
> also (and this is where my total beginner status brings me here
> looking for help :)) i would like to add one more constraint to the
> above regex. i want to match strings *iff* each pair of numbers are
> different. e.g: 1,1/3,4/5,6/7,8/9,10/11,12 or
> 1,2/3,4/5,6/7,8/9,10/12,12 should fail to be matched  by my final
> regex whereas 1,2/3,4/5,6/7,8/9,10/11,12 should match OK.
> 
> any tips would be much appreciated - especially regarding preceding
> paragraph!
> 
> and now for the python part:
> 
> results = "1,2/3,4/5,6/7,8/9,10/11,12"
> match = re.match("((1[0-4]|[1-9]),(1[0-4]|[1-9])/){5}(1[0-4]|[1-9]),(1[0-4]|[1-9])", results)

Always use "raw" strings for patterns, even if you don't have 
backslashes in them -- and this one needs a backslash (for \Z); see 
below.

For clarity, consider using "mobj" or even "m" instead of "match" to 
name the result of re.match.


> if match == None or match.group(0) != results:

Instead of
    if mobj == None ...
use
    if mobj is None ...
or
    if not mobj ...
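
A small illustration of why "is None" is the safer test. The class 
AlwaysEqual is invented for the demonstration: "==" asks the object 
whether it considers itself equal, and an object can answer anything it 
likes, whereas "is" checks identity and cannot be overridden.

```python
import re

class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""
    def __eq__(self, other):
        return True

obj = AlwaysEqual()
looks_like_none = (obj == None)   # True: __eq__ was consulted
really_none = (obj is None)       # False: identity test, not overridable

# re.match returns None on failure; a match object is always truthy,
# so "if not mobj" and "if mobj is None" agree for match results.
failed = re.match(r"\d+", "abc")
matched = re.match(r"\d+", "123")
```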

Instead of the "or match.group(0) != results" caper, put \Z (*not* $, 
which also matches just before a trailing newline) at the end of your 
pattern:
    mobj = re.match(r"pattern\Z", results)
    if not mobj:
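
To see the difference between $ and \Z, here is a short sketch using 
your pattern (with non-capturing groups, which is my own substitution). 
The variable names are made up for illustration.

```python
import re

# The poster's pattern, written as a raw string with non-capturing groups.
PATTERN = (r"(?:(?:1[0-4]|[1-9]),(?:1[0-4]|[1-9])/){5}"
           r"(?:1[0-4]|[1-9]),(?:1[0-4]|[1-9])")

good = "1,2/3,4/5,6/7,8/9,10/11,12"
with_newline = good + "\n"

# $ matches at the end of the string OR just before a final newline,
# so the string with a trailing newline slips through.
dollar_accepts_newline = re.match(PATTERN + r"$", with_newline) is not None

# \Z matches only at the very end of the string.
z_accepts_newline = re.match(PATTERN + r"\Z", with_newline) is not None
z_accepts_good = re.match(PATTERN + r"\Z", good) is not None
```

Here dollar_accepts_newline comes out True while z_accepts_newline is 
False, which is why \Z is the better choice for exact validation.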


HTH,
John


