One liners

Roy Smith roy at panix.com
Fri Dec 6 19:56:54 EST 2013


In article <mailman.3682.1386376799.18130.python-list at python.org>,
 Joel Goldstick <joel.goldstick at gmail.com> wrote:

> Aside from django urls, I am not sure I ever wrote regexes in python.  For
> some reason they must seem awfully sexy to quite a few people.  Back to my
> point above -- ever try to figure out a complicated regex written by
> someone else?

Regexes have a bad rap in the Python community.  To be sure, you can abuse 
them and write horrible monstrosities.  On the other hand, stuff like 
this (slightly reformatted for posting):

import re

pattern = re.compile(
        r'haproxy\[(?P<pid>\d+)]: '
        r'(?P<client_ip>(\d{1,3}\.){3}\d{1,3}):'
        r'(?P<client_port>\d{1,5}) '
        r'\[(?P<accept_date>\d{2}/\w{3}/\d{4}(:\d{2}){3}\.\d{3})] '
        r'(?P<frontend_name>\S+) '
        r'(?P<backend_name>\S+)/'
        r'(?P<server_name>\S+) '
        r'(?P<Tq>(-1|\d+))/'
        r'(?P<Tw>(-1|\d+))/'
        r'(?P<Tc>(-1|\d+))/'
        r'(?P<Tr>(-1|\d+))/'
        r'(?P<Tt>\+?\d+) '
        r'(?P<status_code>\d{3}) '
        r'(?P<bytes_read>\d+) '
        r'(?P<captured_request_cookie>\S+) '
        r'(?P<captured_response_cookie>\S+) '
        r'(?P<termination_state>[\w-]{4}) '
        r'(?P<actconn>\d+)/'
        r'(?P<feconn>\d+)/'
        r'(?P<beconn>\d+)/'
        r'(?P<srv_conn>\d+)/'
        r'(?P<retries>\d+) '
        r'(?P<srv_queue>\d+)/'
        r'(?P<backend_queue>\d+) '
        r'(\{(?P<request_id>.*?)\} )?'   # Comment out for stock haproxy
        r'(\{(?P<captured_request_headers>.*?)\} )?'
        r'(\{(?P<captured_response_headers>.*?)\} )?'
        r'"(?P<http_request>.+)"'
        )

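while intimidating at first glance, really isn't that hard to 
understand.  Python's raw string literals, adjacent string literal 
concatenation, and automatic line continuation inside parentheses team 
up to eliminate a lot of extra fluff: no doubled backslashes, no + 
operators, and no trailing line-continuation backslashes.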
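
To show how the named groups pay off when you actually use such a 
pattern, here is a minimal sketch.  It assumes a shortened pattern (only 
the first five fields of the one above) and a made-up log line; neither 
is from real haproxy output, so treat both as illustrative.

```python
import re

# Shortened, illustrative pattern: only the first five fields of the
# full expression, built the same way (raw strings, adjacent literals
# joined by the compiler, continuation inside the parentheses).
pattern = re.compile(
    r'haproxy\[(?P<pid>\d+)]: '
    r'(?P<client_ip>(\d{1,3}\.){3}\d{1,3}):'
    r'(?P<client_port>\d{1,5}) '
    r'\[(?P<accept_date>\d{2}/\w{3}/\d{4}(:\d{2}){3}\.\d{3})] '
    r'(?P<frontend_name>\S+)'
)

# Made-up log line, for illustration only.
line = 'haproxy[1234]: 10.0.0.1:51234 [06/Dec/2013:19:56:54.123] www'

m = pattern.search(line)
if m is None:
    raise ValueError('line did not match')

# groupdict() returns only the named groups, keyed by name, so the
# unnamed helper groups such as (\d{1,3}\.) do not clutter the result.
fields = m.groupdict()
print(fields['pid'], fields['client_ip'], fields['accept_date'])
```

Each line of the pattern maps to one field of the log entry, and the 
code that consumes the match refers to fields by name rather than by 
position, so adding or removing a line doesn't renumber anything.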


