Parsing Search Criteria

Greg Krohn 'tert at pncen.hf'.decode('rot-13')
Thu Mar 6 13:48:44 EST 2003


I want to parse a criteria string, counting words enclosed in quotes as one
'word'. Like this:

>>> parse_criteria('parrot "I like Python." ternary operator trouble')
['I like Python.', 'parrot', 'ternary', 'operator', 'trouble']

I haven't found anything by Googling, so I wrote the following. I have two
questions. One: Do any modules exist that do this better? Two: If not One,
do you see any potential pitfalls with my code? Could it be more elegant?
(Pretend that's one question.)

Thanks for any help.



def parse_criteria(criteria):
    """Parse a search criteria string into a list of words."""

    quotes = [] #indices of quotes
    quotepos = -1
    words = [] #all words found counting a fragment in quotes as one word

    #find all the quotes
    while 1:
        quotepos = criteria.find('"', quotepos + 1)
        if quotepos == -1:
            break
        else:
            quotes.append(quotepos)

    #remove an unmatched quote from the end
    if len(quotes) % 2:
        del quotes[-1]

    #add the quoted fragments to the word list
    #Note: the list comp. creates a list of (start, end) tuples
    for sq, eq in [(x, x+1) for x in range(0, len(quotes), 2)]:
        frag = criteria[quotes[sq]+1:quotes[eq]]
        frag = frag.replace('"', '')
        words.append(frag)

    #remove the quoted frags from the criteria
    for word in words:
        criteria = criteria.replace('"%s"' % word, ' ')

    #remove empty words
    for word in criteria.split():
        word = word.strip()
        if word:
            words.append(word)

    return words
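
For comparison, here is a shorter sketch of the same idea using the standard
re module. The name parse_criteria_re is made up for illustration, and this is
an untuned alternative, not a drop-in replacement. It assumes the same rules as
above: quotes pair up left to right, quoted fragments come first in the result,
and a trailing unmatched quote stays attached to its bare word.

```python
import re

def parse_criteria_re(criteria):
    """Hypothetical regex variant: quoted fragments first, then bare words."""
    # Text between each left-to-right pair of double quotes
    quoted = re.findall(r'"([^"]*)"', criteria)
    # Blank out the quoted fragments, leaving only the bare words
    rest = re.sub(r'"[^"]*"', ' ', criteria)
    # split() with no arguments already discards empty strings
    return quoted + rest.split()

print(parse_criteria_re('parrot "I like Python." ternary operator trouble'))
```

As in the loop-based version, an empty pair of quotes ("") still yields an
empty string in the result, so that case may need filtering.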

--
Greg Krohn
'tert at pncen.hf'.decode('rot-13')





