Issue with regular expressions

Paul Melis paul at science.uva.nl
Tue Apr 29 10:30:09 EDT 2008


Julien wrote:
> Hi,
> 
> I'm fairly new in Python and I haven't used the regular expressions
> enough to be able to achieve what I want.
> I'd like to select terms in a string, so I can then do a search in my
> database.
> 
> query = '   "  some words"  with and "without    quotes   "  '
> p = re.compile(magic_regular_expression)   # <--- the magic happens
> m = p.match(query)
> 
> I'd like m.groups() to return:
> ('some words', 'with', 'and', 'without quotes')
> 
> Is that achievable with a single regular expression, and if so, what
> would it be?

Here's one way with a single regexp plus an extra filter function.

 >>> import re
 >>> query = '   "  some words"  with and "without    quotes   "  '
 >>> p = re.compile('("([^"]+)")|([^ \t]+)')
 >>> m = p.findall(query)
 >>> m
[('"  some words"', '  some words', ''), ('', '', 'with'), ('', '', 'and'), ('"without    quotes   "', 'without    quotes   ', '')]
 >>> def f(t):
...     if t[0] == '':
...         return t[2]
...     else:
...         return t[1]
...
 >>> list(map(f, m))
['  some words', 'with', 'and', 'without    quotes   ']

If you want to strip the leading/trailing whitespace from the quoted
strings, change the last return statement to "return t[1].strip()".
Note that strip() leaves the runs of spaces inside a quoted string
(as in "without    quotes") untouched.
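To get exactly the tuple asked for, including 'without quotes' with a
single space, the same idea can be wrapped in a small function. This is
only a sketch: the names TERM_RE and split_terms are made up for
illustration, the pattern uses \S+ instead of [^ \t]+ (so newlines also
separate words), and the choice to drop quoted strings that contain
nothing but whitespace is an assumption about what you'd want.

```python
import re

# A quoted run (group 1) or a bare word (group 2).
TERM_RE = re.compile(r'"([^"]+)"|(\S+)')

def split_terms(query):
    """Return the search terms in query with whitespace normalised."""
    terms = []
    for quoted, bare in TERM_RE.findall(query):
        # Exactly one of the two groups is non-empty for each match.
        # split() + join() strips the ends and collapses inner runs.
        term = " ".join((quoted or bare).split())
        if term:  # assumption: skip quoted runs that are only whitespace
            terms.append(term)
    return terms

query = '   "  some words"  with and "without    quotes   "  '
print(split_terms(query))
```

Wrap the result in tuple() if you really need a tuple rather than a
list.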

Paul


