Issue with regular expressions

Matimus mccredie at gmail.com
Tue Apr 29 12:51:02 EDT 2008


On Apr 29, 6:46 am, Julien <jpha... at gmail.com> wrote:
> Hi,
>
> I'm fairly new in Python and I haven't used the regular expressions
> enough to be able to achieve what I want.
> I'd like to select terms in a string, so I can then do a search in my
> database.
>
> query = '   "  some words"  with and "without    quotes   "  '
> p = re.compile(magic_regular_expression)   # <--- the magic happens
> m = p.match(query)
>
> I'd like m.groups() to return:
> ('some words', 'with', 'and', 'without quotes')
>
> Is that achievable with a single regular expression, and if so, what
> would it be?
>
> Any help would be much appreciated.
>
> Thanks!!
>
> Julien

I don't know whether it is possible to do it all with one regex, but it
doesn't seem practical. I would check out the shlex module.

>>> import shlex
>>>
>>> query = '   "  some words"  with and "without    quotes   "  '
>>> shlex.split(query)
['  some words', 'with', 'and', 'without    quotes   ']

To get rid of the leading and trailing whitespace you can then use strip:

>>> [s.strip() for s in shlex.split(query)]
['some words', 'with', 'and', 'without    quotes']

The only remaining problem is the extra whitespace in the middle of the
quoted phrases, for which re might still be a good solution.

>>> import re
>>> [re.sub(r"\s+", ' ', s.strip()) for s in shlex.split(query)]
['some words', 'with', 'and', 'without quotes']
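
The steps above could be wrapped in a small function, roughly like the
sketch below. The names split_terms and split_terms_re are made up for
illustration. The second function is a rough re-only alternative based on
re.findall rather than match/groups; it assumes only double quotes matter
and that there are no escaped quotes, so it will not agree with shlex in
every edge case.

```python
import re
import shlex


def split_terms(query):
    """Split a search query into terms, keeping quoted phrases together."""
    # shlex.split honors the quotes; it raises ValueError if a quote
    # is left unclosed.
    terms = shlex.split(query)
    # Trim each term and collapse internal runs of whitespace.
    return [re.sub(r"\s+", " ", term.strip()) for term in terms]


def split_terms_re(query):
    """Rough re-only variant (assumes double quotes only, no escapes)."""
    # Each match is either a quoted phrase (group 1) or a bare word (group 2).
    pairs = re.findall(r'"([^"]*)"|(\S+)', query)
    return [re.sub(r"\s+", " ", (quoted or bare).strip())
            for quoted, bare in pairs]
```

One thing to watch for: shlex also treats single quotes and backslashes
specially, so a query containing an apostrophe (e.g. don't) will raise
ValueError unless you handle that case yourself.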

Matt


