re.compile and very specific searches

Diez B. Roggisch deetsNOSPAM at web.de
Fri Feb 18 15:55:31 EST 2005


> The OP wanted to "find" IP addresses -- unclear whether re.search or
> re.match is required. Your solution doesn't address the search case.
> For the match case, it needs some augmentation. It will fall apart if
> presented with something like "..." or "comp.lang.python.announce". AND
> while I'm at it ... in the event of a valid string of digits, it will
> evaluate int(d) twice, rather unnecessarily & uglily.


You are right of course. I concentrated on the right value range, but bogus
entries should be dealt with, too.

> ! for s in strings_possibly_containing_digits:
> ! #   if not(s.isdigit() and 0 <= int(s) <= 255): # prettier, but test on zero is now redundant
> !     if not s.isdigit() or int(s) > 255:
> 

Instead of this, I'd go for 

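def validate_ip4(address):
    # An IPv4 address in dotted notation has exactly four parts.
    digits = address.split(".")
    if len(digits) == 4:
        try:
            for d in digits:
                d = int(d)  # raises ValueError for "" or non-numeric parts
                if d < 0 or d > 255:
                    return False
            return True
        except ValueError:
            pass
    return False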
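
A quick check of the function against the bogus inputs mentioned in the quoted text, as a sketch rather than a full test suite. The function is repeated so the snippet runs on its own; the sample strings other than "..." and "comp.lang.python.announce" are my own picks.

```python
def validate_ip4(address):
    # Same logic as above: four parts, each an integer in 0..255.
    digits = address.split(".")
    if len(digits) == 4:
        try:
            for d in digits:
                d = int(d)
                if d < 0 or d > 255:
                    return False
            return True
        except ValueError:
            pass
    return False


samples = [
    "192.168.0.1",                # plausible address
    "256.1.1.1",                  # first part out of range
    "...",                        # four empty parts, int("") raises ValueError
    "comp.lang.python.announce",  # four non-numeric parts
    "1.2.3",                      # only three parts
]
for s in samples:
    print(repr(s), validate_ip4(s))
```

One caveat worth knowing: int() is more lenient than a pure digit test. It accepts a leading sign and surrounding whitespace, so a string like "1.2.3.+4" passes this validator. Whether that matters depends on where the strings come from.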

And I don't think that isdigit() is necessarily faster than int(). They
basically do the same thing.
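
If the speed question matters, it is easy to measure instead of guessing. The following is only a rough sketch using timeit from the standard library; check_isdigit and check_int are hypothetical helper names made up for the comparison, and the numbers will vary with the interpreter version and the input.

```python
import timeit


def check_isdigit(s):
    # Variant from the quoted text: test for digits first, then convert.
    return s.isdigit() and int(s) <= 255


def check_int(s):
    # Variant that converts directly and lets int() reject bad input.
    try:
        return 0 <= int(s) <= 255
    except ValueError:
        return False


# Time both variants on one valid and one invalid input.
for s in ("200", "abc"):
    t_isdigit = timeit.timeit(lambda: check_isdigit(s), number=100000)
    t_int = timeit.timeit(lambda: check_int(s), number=100000)
    print(repr(s), "isdigit: %.4fs" % t_isdigit, "int: %.4fs" % t_int)
```

I would expect the difference to show up mostly on invalid input, where the int() variant has to raise and catch an exception, but that is an assumption to verify with the measurement, not a result.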

> and the search case: DON'T dump re; it can find highly probable
> candidates (using a regexp like the OP's original or yours) a damn
> sight faster than anything else this side of C or Pyrex. Then you
> validate the result, with a cut-down validator that relies on the fact
> that there are 4 segments and they contain only digits:

The search case needs a regular expression. But the OP didn't say much about
what he actually wants.
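
Assuming the OP wants to pull addresses out of free text, the two-step approach from the quoted text could look roughly like this: a regular expression finds candidates, then a cut-down check validates the range. The pattern and the names CANDIDATE and find_ip4 are my own illustration, not the OP's code.

```python
import re

# Four groups of 1-3 digits separated by dots. The lookbehind rejects
# candidates glued to a preceding digit or dot; the lookahead rejects
# candidates followed by a digit or by a dot plus a digit ("1.2.3.4.5").
CANDIDATE = re.compile(r"(?<![\d.])\d{1,3}(?:\.\d{1,3}){3}(?!\.?\d)")


def find_ip4(text):
    """Return all substrings of text that look like valid IPv4 addresses."""
    found = []
    for match in CANDIDATE.finditer(text):
        candidate = match.group()
        # The regex guarantees four all-digit parts, so only the upper
        # bound needs checking here.
        if all(int(part) <= 255 for part in candidate.split(".")):
            found.append(candidate)
    return found


text = "host 10.0.0.1, bogus 300.1.1.1, version 1.2.3.4.5, end 192.168.1.255."
print(find_ip4(text))
```

The lookarounds are a design choice: without them, the pattern would happily pick "1.2.3.4" out of a longer dotted version number. A trailing full stop at the end of a sentence is still tolerated.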

-- 
Regards,

Diez B. Roggisch


