Issue with regular expressions
Paul Melis
paul at science.uva.nl
Tue Apr 29 10:30:09 EDT 2008
Julien wrote:
> Hi,
>
> I'm fairly new in Python and I haven't used the regular expressions
> enough to be able to achieve what I want.
> I'd like to select terms in a string, so I can then do a search in my
> database.
>
> query = ' " some words" with and "without quotes " '
> p = re.compile(magic_regular_expression) # <--- the magic happens
> m = p.match(query)
>
> I'd like m.groups() to return:
> ('some words', 'with', 'and', 'without quotes')
>
> Is that achievable with a single regular expression, and if so, what
> would it be?
Here's one way with a single regexp plus an extra filter function.
>>> import re
>>> query = ' " some words" with and "without quotes " '
>>> p = re.compile('("([^"]+)")|([^ \t]+)')
>>> m = p.findall(query)
>>> m
[('" some words"', ' some words', ''), ('', '', 'with'), ('', '',
'and'), ('"without quotes "', 'without quotes ', '')]
>>> def f(t):
...     if t[0] == '':
...         return t[2]
...     else:
...         return t[1]
...
>>> list(map(f, m))
[' some words', 'with', 'and', 'without quotes ']
If you want to strip away the leading/trailing whitespace from the
quoted strings, then change the last return statement to
be "return t[1].strip()".
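
For reference, the regexp and the filter can be folded into one helper. This is only a sketch of the same approach with the strip() applied; the names TERM_RE and split_terms are made up for illustration, and unbalanced quotes are not handled specially.

```python
import re

# Same pattern as in the session above: either a double-quoted run
# (group 2 holds the text between the quotes) or a run of characters
# that are neither spaces nor tabs (group 3).
TERM_RE = re.compile(r'("([^"]+)")|([^ \t]+)')

def split_terms(query):
    """Return the search terms in query, keeping quoted phrases whole."""
    terms = []
    for quoted, inner, bare in TERM_RE.findall(query):
        # 'quoted' is non-empty only when the first alternative matched.
        term = inner.strip() if quoted else bare
        if term:  # skip phrases that were nothing but whitespace
            terms.append(term)
    return terms

query = ' " some words" with and "without quotes " '
print(split_terms(query))
# prints ['some words', 'with', 'and', 'without quotes']
```

Wrap the result in tuple() if you need exactly the tuple shown in the question.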
Paul