a list/re problem

Tim Chase python.list at tim.thechases.com
Fri Dec 11 16:05:36 EST 2009


> l = ['asc', '*nbh*', 'jlsdjfdk', 'ikjh', '*jkjsdfjasd*', 'rewr']
>
> Notice that some of the items in the list start and end with an '*'. I wish to construct a new list, call it 'n' which is all the members of l that start and end with '*', with the '*'s removed.
>
> So in the case above n would be ['nbh', 'jkjsdfjasd']
>
> the following works:
>
> import re
> r = re.compile(r'\*(.+)\*')
>
> def f(s):
>      m = r.match(s)
>      if m:
>          return m.group(1)
>      else:
>          return ''
> 	
> n =  [f(x) for x in l if r.match(x)]
>
> But it is inefficient, because it is matching the regex twice for each item, and it is a bit ugly.

You can skip the function by writing that as

   n = [r.match(s).group(1) for s in l if r.match(s)]

but it doesn't solve your match-twice problem.
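
If you do want to keep the regex, one way around the double match is to run r.match over the list once and filter on the resulting match objects. This is only a sketch: the loop name m is made up for illustration, and r and l are the same pattern and list as in the question.

```python
import re

# Same pattern and data as in the question; raw string avoids escape warnings.
r = re.compile(r'\*(.+)\*')
l = ['asc', '*nbh*', 'jlsdjfdk', 'ikjh', '*jkjsdfjasd*', 'rewr']

# map() applies r.match to each item exactly once; items that do not
# match yield None, which the "if m" filter drops.
n = [m.group(1) for m in map(r.match, l) if m]

print(n)
```

Each string is matched a single time, and the helper function is no longer needed.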

I'd skip regexps completely and do something like

   n = [s[1:-1] for s in l
        if s.startswith('*')
        and s.endswith('*')
        ]
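
One caveat with the slicing version: unlike the regex, whose .+ requires at least one character between the asterisks, a bare '*' or '**' passes both tests and produces an empty string. If that matters for your data, a length check is one possible guard. The names loose and strict and the extra test items below are made up for illustration.

```python
# Sample data with two edge cases added: '*' and '**'.
l = ['asc', '*nbh*', '*', '**', '*jkjsdfjasd*', 'rewr']

# Without a length check, '*' and '**' both slip through as ''.
loose = [s[1:-1] for s in l
         if s.startswith('*')
         and s.endswith('*')
         ]

# Requiring more than two characters mirrors the regex's ".+".
strict = [s[1:-1] for s in l
          if len(s) > 2
          and s.startswith('*')
          and s.endswith('*')
          ]

print(loose)
print(strict)
```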

And this is coming from a guy that tends to overuse regexps :)

-tkc




