re.compile and very specific searches

rbt rbt at athop1.ath.vt.edu
Fri Feb 18 17:44:37 EST 2005


John Machin wrote:
> Diez B. Roggisch wrote:
> 
> 
> 
>>So I'd suggest you dump re and do it like this:
>>
>>address = "192.168.1.1"
>>
>>def validate_ip4(address):
>>    digits = address.split(".")
>>    if len(digits) == 4:
>>        for d in digits:
>>            if int(d) < 0 or int(d) > 255:
>>                  return False
>>    return True
>>
> 
> 
> The OP wanted to "find" IP addresses -- unclear whether re.search or
> re.match is required. Your solution doesn't address the search case.
> For the match case, it needs some augmentation. It will fall apart if
> presented with something like "..." or "comp.lang.python.announce". AND
> while I'm at it ... in the event of a valid string of digits, it will
> evaluate int(d) twice, rather unnecessarily & uglily.
> 
> So: match case:
> 
> ! for s in strings_possibly_containing_digits:
> ! #   if not(s.isdigit() and 0 <= int(s) <= 255): # prettier, but test on zero is now redundant
> !     if not s.isdigit() or int(s) > 255:
> 
> and the search case: DON'T dump re; it can find highly probable
> candidates (using a regexp like the OP's original or yours) a damn
> sight faster than anything else this side of C or Pyrex. Then you
> validate the result, with a cut-down validator that relies on the fact
> that there are 4 segments and they contain only digits:

This is what I ended up doing: re.compile followed by findall(data) does
an excellent job of finding every string that looks like an IPv4 address,
and the split-based check then works just as well at weeding out the
strings that are not actual IPv4 addresses.
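
A minimal sketch of that two-step approach, for anyone finding this in the
archives. The pattern and the names CANDIDATE, is_valid_ip4 and find_ip4
are assumptions made up for illustration, not code taken from this thread,
and the loose pattern has no boundary checks, so it can also pick up a
candidate from the tail of a longer digit run:

```python
import re

# Loose candidate pattern (an assumption for this sketch): four
# dot-separated runs of one to three digits, with no range checking.
CANDIDATE = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")


def is_valid_ip4(candidate):
    """Return True if every dot-separated segment is at most 255.

    Cut-down validator: it relies on the regexp having already
    guaranteed four segments that contain only digits.
    """
    return all(int(part) <= 255 for part in candidate.split("."))


def find_ip4(data):
    """Return the candidates found in data that pass the range check."""
    return [c for c in CANDIDATE.findall(data) if is_valid_ip4(c)]


data = "gw 192.168.1.1, bogus 999.1.1.1, dns 10.0.0.53"
print(find_ip4(data))  # ['192.168.1.1', '10.0.0.53']
```

The regexp does the fast scan and the split only runs on the handful of
candidates it returns, so int() is called once per segment.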

Thanks to all for the advice!


