regular expression back references

Clay Shirky clay at shirky.com
Fri Aug 8 21:56:30 EDT 2003


ruach at chpc.utah.edu (Matthew) wrote in message news:<ec1162c7.0308081412.4c0d2fca at posting.google.com>...

> Here is my pattern:
> 
> macExpression = "^[0-9A-F]{1,2}(\:|\.|\-)[0-9A-F]{1,2}\1[0-9A-F]{1,2}\1[0-9A-F]{1,2}\1[0-9A-F]{1,2}\1[0-9A-F]{1,2}$"

good lord, that looks like perl.

that sort of thing is miserable to write and miserable to maintain. it
makes more sense to treat MAC addresses as numbers rather than strings
(and saves you the horror of upper/lower case and "is it 0 or 00?"
issues as well)
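a quick sketch of why numbers are easier (python 3 syntax; the sample
strings and the names mac_a and mac_b are made up for illustration):
once each piece goes through int(..., 16), case and zero-padding
differences disappear.

```python
# parse hex strings as base-16 integers; case and leading zeros stop mattering
assert int("a0", 16) == int("A0", 16) == 160
assert int("0", 16) == int("00", 16) == 0

# two differently formatted spellings of the same MAC compare equal as lists
mac_a = [int(part, 16) for part in "00:A0:C9:EE:B2:C0".split(":")]
mac_b = [int(part, 16) for part in "0:a0:c9:ee:b2:c0".split(":")]
print(mac_a == mac_b)
```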

use the re module to figure out what to split on, then convert
everything to numeric comparisons. here's an example, more readable
than the macExpression above:

import re 

orig_list = [ 0, 160, 201, 238, 178, 192 ] # reference MAC 00:A0:C9:EE:B2:C0 as numbers

new_addresses = [ "00:30:65:01:dc:9f", # various formats...
                  "00-03-93-52-0c-c6",
                  "00.A0.C9.EE.B2.C0" ]

for new_address in new_addresses:
    
    test_list = []
    
    # use regexes to see what to split on
    if re.search(":", new_address):
        new_list = new_address.split(":")
    elif re.search("-", new_address):
        new_list = new_address.split("-")
    elif re.search(r"\.", new_address): # escape the dot; a bare "." matches any character
        new_list = new_address.split(".")
    else:
        continue # no known separator, skip this address
    
    # convert hex strings to numbers
    # via an int() call, in base 16
    for hex_byte in new_list:
        test_list.append(int(hex_byte, 16)) # make a test list
    
    if test_list == orig_list: # check for numeric matches
        print("%s matches..." % new_address)
    else:
        print("%s doesn't match..." % new_address)
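
the same idea can be packed into one function. this is only a sketch
in python 3 syntax: parse_mac and HEX_OCTET are hypothetical names,
not anything from the post above or the standard library. it swaps
the if/elif chain for a single re.split on any of the three
separators, and checks each piece is one or two hex digits before
converting.

```python
import re

# one or two hex digits, either case (hypothetical helper pattern)
HEX_OCTET = re.compile(r"[0-9A-Fa-f]{1,2}")

def parse_mac(address):
    """Return the six octets of a MAC address as ints, or None if malformed."""
    # split on ':', '.' or '-' in one step
    parts = re.split(r"[:.-]", address)
    # need exactly six pieces, each made of one or two hex digits
    if len(parts) != 6 or not all(HEX_OCTET.fullmatch(p) for p in parts):
        return None
    return [int(p, 16) for p in parts]

orig_list = [0, 160, 201, 238, 178, 192]
for candidate in ["00:30:65:01:dc:9f", "00-03-93-52-0c-c6", "00.A0.C9.EE.B2.C0"]:
    print(candidate, parse_mac(candidate) == orig_list)
```

note that this sketch accepts mixed separators such as
"00:A0-C9.EE:B2:C0". the \1 back reference in the quoted
macExpression rejects those, so keep a separator check if that
matters to you.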



