Issue with regular expressions

George Sakkis george.sakkis at gmail.com
Tue Apr 29 13:44:33 EDT 2008


On Apr 29, 9:46 am, Julien <jpha... at gmail.com> wrote:

> Hi,
>
> I'm fairly new in Python and I haven't used the regular expressions
> enough to be able to achieve what I want.
> I'd like to select terms in a string, so I can then do a search in my
> database.
>
> query = '   "  some words"  with and "without    quotes   "  '
> p = re.compile(magic_regular_expression)   # <--- the magic happens
> m = p.match(query)
>
> I'd like m.groups() to return:
> ('some words', 'with', 'and', 'without quotes')
>
> Is that achievable with a single regular expression, and if so, what
> would it be?

As other replies mention, a single expression will not do it, because
you are doing two things: finding all the terms and collapsing the extra
spaces inside the quoted ones. Two expressions are enough, though:

import re

def normquery(text, findterms=re.compile(r'"([^"]+)"|(\S+)').findall,
                    normspace=re.compile(r'\s{2,}').sub):
    # t[0] is the quoted phrase, t[1] the bare word; exactly one is non-empty
    return [normspace(' ', (t[0] or t[1]).strip())
            for t in findterms(text)]

>>> normquery('   "some words"  with and "without    quotes   "  ')
['some words', 'with', 'and', 'without quotes']
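
To see what each step contributes, the two stages can also be written
out separately. This is only a sketch: the names FIND_TERMS, split_terms
and normalize_query are made up for illustration, and the second stage
uses ' '.join(term.split()) in place of the \s{2,} expression, which is
an assumption on my part about what "extra spaces" should mean (it also
turns a lone tab or newline inside a term into a single space).

```python
import re

# Stage 1: a quoted phrase (group 1) or a run of non-whitespace (group 2).
FIND_TERMS = re.compile(r'"([^"]+)"|(\S+)')

def split_terms(text):
    """Return the raw terms; quoted phrases keep their inner spacing."""
    # findall yields (quoted, bare) tuples; the group that did not
    # take part in the match is an empty string.
    return [quoted or bare for quoted, bare in FIND_TERMS.findall(text)]

def normalize_query(text):
    """Stage 2: trim each term and collapse whitespace runs inside it."""
    return [' '.join(term.split()) for term in split_terms(text)]

query = '   "  some words"  with and "without    quotes   "  '
raw = split_terms(query)        # quoted terms still carry extra spaces
terms = normalize_query(query)  # ['some words', 'with', 'and', 'without quotes']
```

Here raw still contains '  some words' and 'without    quotes   ', which
is why the second pass is needed after the findall.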


HTH,
George


