Issue with regular expressions

Robert Bossy Robert.Bossy at jouy.inra.fr
Tue Apr 29 10:40:43 EDT 2008


Julien wrote:
> Hi,
>
> I'm fairly new in Python and I haven't used the regular expressions
> enough to be able to achieve what I want.
> I'd like to select terms in a string, so I can then do a search in my
> database.
>
> query = '   "  some words"  with and "without    quotes   "  '
> p = re.compile(magic_regular_expression)   # <--- the magic happens
> m = p.match(query)
>
> I'd like m.groups() to return:
> ('some words', 'with', 'and', 'without quotes')
>
> Is that achievable with a single regular expression, and if so, what
> would it be?
>
> Any help would be much appreciated.
>   
Hi,

I don't think re is the best tool for this. There may be a regular
expression that does what you want, but it would be quite complex and
hard to maintain.

I suggest you split the query on the double quotes and process the
alternating outside/inside chunks. Something like:

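import re

def spulit(s):
    # Chunks between double quotes alternate: outside, inside, outside, ...
    inq = False
    for term in s.split('"'):
        if inq:
            # Inside quotes: keep the phrase whole, collapsing runs of whitespace.
            yield re.sub(r'\s+', ' ', term.strip())
        else:
            # Outside quotes: every whitespace-separated word is its own term.
            for word in term.split():
                yield word
        inq = not inq

for token in spulit('   "  some words"  with and "without    quotes   "  '):
    print(token)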
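
For comparison, here is a rough sketch of a single-pattern variant. It is
not quite what was asked for: a pattern has a fixed number of groups, so
p.match(query).groups() cannot return a variable number of terms, and the
sketch iterates with finditer() instead. The names TOKEN_RE and tokens are
made up for illustration, and it assumes the quotes are balanced (a stray
quote simply stays attached to its word).

```python
import re

# Hypothetical pattern: either a double-quoted chunk (group 1)
# or a run of non-whitespace characters (group 2).
TOKEN_RE = re.compile(r'"([^"]*)"|(\S+)')

def tokens(query):
    result = []
    for m in TOKEN_RE.finditer(query):
        if m.group(1) is not None:
            # Quoted chunk: trim it and collapse internal whitespace.
            result.append(' '.join(m.group(1).split()))
        else:
            # Bare word outside quotes.
            result.append(m.group(2))
    return tuple(result)

terms = tokens('   "  some words"  with and "without    quotes   "  ')
print(terms)  # ('some words', 'with', 'and', 'without quotes')
```

The whitespace cleanup still has to happen outside the pattern, which is
part of why the split-based approach is easier to maintain.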
   
  
Cheers,
RB


