Parsing a search string

Freddie lion-freddie at zebra-madcowdisease.giraffe-org
Fri Dec 31 20:05:46 EST 2004


Reinhold Birkenfeld wrote:
> Freddie wrote:
> 
>>Happy new year! Since I have run out of alcohol, I'll ask a question that I 
>>haven't really worked out an answer for yet. Is there an elegant way to turn 
>>something like:
>>
>> > moo cow "farmer john" -zug
>>
>>into:
>>
>>['moo', 'cow', 'farmer john'], ['zug']
>>
>>I'm trying to parse a search string so I can use it for SQL WHERE constraints, 
>>preferably without horrifying regular expressions. Uhh yeah.
> 
> 
> The shlex approach, finished:
> 
> import shlex
> 
> searchstring = 'moo cow "farmer john" -zug'
> lexer = shlex.shlex(searchstring)
> lexer.wordchars += '-'
> poslist, neglist = [], []
> while 1:
>     token = lexer.get_token()
>     # token is '' on eof
>     if not token: break
>     # remove quotes
>     if token[0] in '"\'':
>         token = token[1:-1]
>     # select in which list to put it
>     if token[0] == '-':
>         neglist.append(token[1:])
>     else:
>         poslist.append(token)
> 
> regards,
> Reinhold

Thanks for this, though there was one issue:

 >>> lexer = shlex.shlex('moo cow +"farmer john" -dog')
 >>> lexer.wordchars += '-+'
 >>> while 1:
...     tok = lexer.get_token()
...     if not tok: break
...     print tok
...
moo
cow
+"farmer
john"
-dog

The '+"farmer john"' part gets split into two separate tokens, '+"farmer' 
and 'john"', because the quote is not at the start of the token. I ended up 
using shlex.split() (which the docs say is new in Python 2.3), which gives me 
the desired result. Thanks for the help from yourself and M.E.Farmer :)
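
As far as I can tell, the difference comes down to POSIX mode plus splitting 
on whitespace only, which is how shlex.split() sets up its lexer. A rough 
sketch of the same thing done by hand, written for Python 3 (the variable 
names are just for illustration):

```python
import shlex

query = 'moo cow +"farmer john" -dog'

# POSIX mode strips quotes and lets a quoted section sit inside a word;
# whitespace_split means tokens are only broken on whitespace.
lexer = shlex.shlex(query, posix=True)
lexer.whitespace_split = True

tokens = list(lexer)
print(tokens)  # ['moo', 'cow', '+farmer john', '-dog']
```

With these two settings there is no need to add '-' and '+' to wordchars.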

Freddie

 >>> shlex.split('moo cow +"farmer john" -"evil dog"')
['moo', 'cow', '+farmer john', '-evil dog']
 >>> shlex.split('moo cow +"farmer john" -"evil dog" +elephant')
['moo', 'cow', '+farmer john', '-evil dog', '+elephant']
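
To get from that token list to the two lists asked for at the start of the 
thread, something along these lines should work. This is only a sketch in 
Python 3; parse_search is a made-up name, and treating a leading '+' as 
"wanted" is an assumption about what the prefix should mean.

```python
import shlex

def parse_search(query):
    """Split a search string into (wanted, unwanted) term lists."""
    wanted, unwanted = [], []
    for token in shlex.split(query):
        if token.startswith('-'):
            term, bucket = token[1:], unwanted
        elif token.startswith('+'):
            term, bucket = token[1:], wanted
        else:
            term, bucket = token, wanted
        if term:  # ignore a bare '-' or '+' with nothing after it
            bucket.append(term)
    return wanted, unwanted

print(parse_search('moo cow "farmer john" -zug'))
# (['moo', 'cow', 'farmer john'], ['zug'])
```

Two caveats: shlex.split() raises ValueError on an unclosed quote, so user 
input needs a try/except around the call, and because quotes are stripped 
before the prefix check, a quoted term such as "-zug" is also treated as 
negated.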



