re question

Matt McCredie mccredie at gmail.com
Thu Sep 20 17:38:59 EDT 2007


On 9/19/07, Dan Bar Dov <bardov at gmail.com> wrote:
> I'm trying to construct a regular expression to match valid IP address,
> without leading zeroes (i.e
> 1.2.3.4, 254.10.0.0, but not 324.1.1.1, nor 010.10.10.1)
>
> This is what I come up with, and it does not work.
>
> r'(^[12]?\d{0,2}\.){3,3}[12]?\d{0,2}'
>
> What am I doing wrong?

I'm not sure what effect having the "^" inside of the parens will
have, but it surely isn't what you want.
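
A quick check suggests what the inner "^" does. It has to match at the
start of the string on every repetition of the group, so the group can
never repeat three times and the pattern matches nothing. This is only a
sketch with the standard re module; the names "original" and "anchored"
are made up for illustration, and the second pattern just moves the "^"
in front of the group without fixing the other problems.

```python
import re

# The pattern from the question, with "^" inside the repeated group.
original = re.compile(r'(^[12]?\d{0,2}\.){3,3}[12]?\d{0,2}')

# The same pattern with "^" moved in front of the group.
anchored = re.compile(r'^([12]?\d{0,2}\.){3}[12]?\d{0,2}')

for text in ["1.2.3.4", "254.10.0.0"]:
    # Print whether each pattern finds a match anywhere in the text.
    print(text, bool(original.search(text)), bool(anchored.search(text)))
```

With the "^" inside the group, neither valid address is found; with it
moved out, both are.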

This part: r"[12]?\d{0,2}" will match the following strings, which I'm
sure you don't want:

"" - yes it will match an empty string (Is "..." a valid IP?)
"00" - It could start with a 0, as long as there are only two characters
"299" - A little outside of the range you are interested in
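
The three strings above can be checked directly. This is a small sketch,
assuming Python 3.4 or later for re.fullmatch, which only succeeds when
the whole string is consumed; the name "fragment" is just for
illustration.

```python
import re

# The per-octet fragment from the original pattern.
fragment = re.compile(r"[12]?\d{0,2}")

for candidate in ["", "00", "299"]:
    # fullmatch returns a match object only if the entire string matches.
    matched = fragment.fullmatch(candidate) is not None
    print(repr(candidate), matched)
```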

That {3,3} is better written as {3}.

> Any common knowledge IP matching RE?

I don't know if there is any common knowledge RE, but I came up with
the following:

r"((1\d{2}|2[0-4]\d|25[0-5]|[1-9]\d|\d)\.){3}(1\d{2}|2[0-4]\d|25[0-5]|[1-9]\d|\d)"

Let us break it down:

This matches an octet:
    r"(1\d{2}|2[0-4]\d|25[0-5]|[1-9]\d|\d)"

Which will match any ONE of the following
    1\d{2}  - A "1" followed by any two digits
    2[0-4]\d - A "2" followed by 0,1,2,3 or 4 followed by any digit
    25[0-5] - A "25" followed by 0,1,2,3,4 or 5
    [1-9]\d - Any digit but 0 followed by any digit
    \d - Any Digit
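
The octet alternation is small enough to check exhaustively. The sketch
below assumes Python 3.4 or later for re.fullmatch; OCTET, octet_re and
is_octet are names invented here. It tries every number from 0 to 999
and a few strings with leading zeros.

```python
import re

# The octet alternation described above, compiled on its own.
OCTET = r"(1\d{2}|2[0-4]\d|25[0-5]|[1-9]\d|\d)"
octet_re = re.compile(OCTET)

def is_octet(text):
    # fullmatch requires the whole string to be consumed by the pattern.
    return octet_re.fullmatch(text) is not None

# Every decimal string from "0" to "999" that the pattern accepts.
accepted = [n for n in range(1000) if is_octet(str(n))]
print(accepted == list(range(256)))

# Strings with leading zeros should be rejected.
print([s for s in ["00", "01", "010", "007"] if is_octet(s)])
```

Note that the order of the alternatives only matters for unanchored
matching; with fullmatch the engine backtracks until the whole string
fits one of them.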

I generally discourage people from using REs. I think the following is
much easier to read:

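def isip(x):
    octets = x.split(".")
    if len(octets) != 4:
        return False
    for octet in octets:
        # Digits only: int() alone would also accept "+1", "-0" or " 1".
        if not octet.isdigit():
            return False
        # No leading zeros, so "010" is rejected but "0" is fine.
        if len(octet) > 1 and octet[0] == "0":
            return False
        try:
            if not 0 <= int(octet) < 256:
                return False
        except ValueError:
            return False
    return True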

Both solutions seem to work, though I used a small set of test cases.
Others may have better suggestions.
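
For anyone who wants to repeat the comparison, here is a minimal sketch
that runs both approaches over the addresses mentioned in this thread
plus a few edge cases. The names OCTET, IP_RE, isip_re and CASES are
invented for the example. isip follows the function above, with an
isdigit() guard so that signs and whitespace are rejected before int()
sees them. The regex is applied with re.fullmatch (Python 3.4 or later),
which is an assumption about how it is meant to be used.

```python
import re

OCTET = r"(1\d{2}|2[0-4]\d|25[0-5]|[1-9]\d|\d)"
IP_RE = re.compile(r"(" + OCTET + r"\.){3}" + OCTET)

def isip_re(text):
    # fullmatch anchors the pattern at both ends of the string.
    return IP_RE.fullmatch(text) is not None

def isip(text):
    octets = text.split(".")
    if len(octets) != 4:
        return False
    for octet in octets:
        # Reject anything that is not made of digits only.
        if not octet.isdigit():
            return False
        # Reject leading zeros such as "010".
        if len(octet) > 1 and octet[0] == "0":
            return False
        try:
            if not 0 <= int(octet) < 256:
                return False
        except ValueError:
            return False
    return True

# Expected verdict for each test address.
CASES = {
    "1.2.3.4": True, "254.10.0.0": True,
    "0.0.0.0": True, "255.255.255.255": True,
    "324.1.1.1": False, "010.10.10.1": False,
    "...": False, "1.2.3": False, "1.2.3.256": False,
}

for text, expected in CASES.items():
    print(text, expected, isip_re(text), isip(text))
```

The anchoring matters: an unanchored IP_RE.search("324.1.1.1") finds
"24.1.1.1" inside the invalid address, so the pattern needs fullmatch
(or explicit "^" and "$") to validate a whole string.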

Matt


