regex: multiple matching for one string

Scott David Daniels Scott.Daniels at Acm.Org
Fri Jul 24 11:54:32 EDT 2009


rurpy at yahoo.com wrote:
> Nick Dumas wrote:
>> On 7/23/2009 9:23 AM, Mark Lawrence wrote:
>>> scriptlearner at gmail.com wrote:
>>>> For example, I have a string "#a=valuea;b=valueb;c=valuec;", and I
>>>> would like to take out the values (valuea, valueb, and valuec).  How do
>>>> I do that in Python?  The group method will only return the matched
>>>> part.  Thanks.
>>>>
>>>> p = re.compile('#a=*;b=*;c=*;')
>>>> m = p.match(line)
>>>> if m:
>>>>     print m.group(),
>>> IMHO a regex for this is overkill, a combination of string methods such
>>> as split and find should suffice.
> 
> You're saying that something like the following
> is better than the simple regex used by the OP?
> [untested]
> values = []
> parts = line.split(';')
> if len(parts) != 4: raise SomeError()
> for p, expected in zip (parts[:-1], ('#a','b','c')):
>     name, x, value = p.partition ('=')
>     if name != expected or x != '=':
>         raise SomeError()
>     values.append (value)
> print values[0], values[1], values[2]
I call straw man: [tested]
     line = "#a=valuea;b=valueb;c=valuec;"
     d = dict(single.split('=', 1)
              for single in line.split(';') if single)
     d['#a'], d['b'], d['c']
If you want checking code, add:
     if len(d) != 3:
         raise ValueError('Expected 3 keys, got %s in %r' % (
                              sorted(d), line))
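
Rolled into one self-contained function, the split-based approach might look like the sketch below. The name parse_pairs and its expected parameter are hypothetical, chosen only for illustration, and the check compares key names rather than just counting them, which is a slightly stricter assumption than the len(d) test above.

```python
def parse_pairs(line, expected=('#a', 'b', 'c')):
    """Hypothetical helper: turn 'k=v;k=v;' text into a dict and check its keys."""
    # line.split(';') leaves an empty piece after the trailing ';', so skip it.
    # A piece with no '=' makes dict() raise ValueError on its own.
    d = dict(single.split('=', 1)
             for single in line.split(';') if single)
    if sorted(d) != sorted(expected):
        raise ValueError('Unexpected keys: %s in %r' % (sorted(d), line))
    return d


d = parse_pairs("#a=valuea;b=valueb;c=valuec;")
values = (d['#a'], d['b'], d['c'])
```

With the sample line, values holds the three strings in order; a line with a missing or misspelled key raises ValueError instead of silently returning a partial dict.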

> Blech, not in my book.  The regex checks the
> format of the string, extracts the values, and
> does so very clearly.  Further, it is easily
> adapted to other similar formats, or evolutionary
> changes in format.  It is also (once one is
> familiar with regexes -- a useful skill outside
> of Python too) easier to get right (at least in
> a simple case like this.)
The posted regex doesn't work; this might be homework, so
I'll not fix the two problems.  The fact that you did not
see the failure weakens your claim of "does so very clearly."
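
Anyone who wants to confirm the failure without having the fix handed to them can run the pattern exactly as posted against the sample string. This is only a check, written with the variable names from the original post plus one extra name (matched) added here for illustration:

```python
import re

line = "#a=valuea;b=valueb;c=valuec;"

# The pattern exactly as posted, unchanged.
p = re.compile('#a=*;b=*;c=*;')
m = p.match(line)

# match() returns None when the pattern does not match at the start of the string.
matched = m is not None
print(matched)  # False
```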

--Scott David Daniels
Scott.Daniels at Acm.Org


