Quote-aware string splitting

Paul McGuire ptmcg at austin.rr.com
Tue Apr 26 06:02:07 EDT 2005


Quoted strings are surprisingly stateful, so using a parser isn't
totally out of line.  Here is a pyparsing example with some added test
cases.  Pyparsing's quotedString built-in handles single or double
quotes (if you don't want to be this permissive, there are also
sglQuotedString and dblQuotedString to choose from), plus escaped quote
characters.
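
To make the "stateful" remark concrete, here is a rough hand-rolled
sketch of the bookkeeping a quote-aware splitter needs.  It is not how
pyparsing does it internally; split_quoted is a made-up name for
illustration, and the sketch assumes backslash escapes only count inside
quotes and that a quote only opens a string at the start of a token.

```python
def split_quoted(text):
    """Hypothetical helper: split on whitespace, keeping quoted runs whole."""
    tokens = []
    current = []     # characters of the token being built
    quote = None     # the open quote character, or None outside quotes
    escaped = False  # True right after a backslash inside quotes
    for ch in text:
        if quote is not None:
            current.append(ch)
            if escaped:
                escaped = False          # this character was escaped
            elif ch == "\\":
                escaped = True           # next character is taken literally
            elif ch == quote:
                tokens.append("".join(current))  # closing quote ends token
                current = []
                quote = None
        elif ch in "'\"" and not current:
            quote = ch                   # opening quote starts a new token
            current.append(ch)
        elif ch.isspace():
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if quote is not None:
        raise ValueError("unterminated quoted string")
    if current:
        tokens.append("".join(current))
    return tokens

print(split_quoted(r"""spam 'it don\'t mean a thing' "the life of brian" 42"""))
```

Three pieces of state (the open quote, the escape flag, the partial
token) is already enough to get wrong by hand, which is the argument for
letting a parser library carry it.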

The snippet below includes two samples.  The first three statements (up
to the first print) give the equivalent of other suggestions in this
thread.  They are followed by a slightly enhanced version that strips
the quotation marks from any quoted entries.

-- Paul
(get pyparsing at http://pyparsing.sourceforge.net)
==========
from pyparsing import *
test = r'''spam 'it don\'t mean a thing' "the life of brian"
           42 'the meaning of "life"' grail'''
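# try a quoted string first, then fall back to a whitespace-delimited word
print(OneOrMore( quotedString | Word(printables) ).parseString( test ))

# strip quotes during parsing
def stripQuotes(s,l,toks):
    # toks[0] is the matched quoted string; drop its first and last characters
    return toks[0][1:-1]
quotedString.setParseAction( stripQuotes )
print(OneOrMore( quotedString | Word(printables) ).parseString( test ))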
==========

returns:
['spam', "'it don\\'t mean a thing'", '"the life of brian"', '42',
 '\'the meaning of "life"\'', 'grail']
['spam', "it don\\'t mean a thing", 'the life of brian', '42',
 'the meaning of "life"', 'grail']
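
One thing to watch in the second sample: the parse action is attached to
the shared, module-level quotedString, so anything else in the same
process that uses quotedString afterwards gets quotes stripped as well.
A possible variation, sketched here on the assumption that your
pyparsing release provides the removeQuotes helper and the camelCase
method names, works on a copy instead:

```python
from pyparsing import OneOrMore, Word, printables, quotedString, removeQuotes

test = r'''spam 'it don\'t mean a thing' "the life of brian"
           42 'the meaning of "life"' grail'''

# copy() leaves the module-level quotedString untouched;
# removeQuotes drops the first and last character of the match
unquoted = quotedString.copy().setParseAction(removeQuotes)
tokens = OneOrMore(unquoted | Word(printables)).parseString(test).asList()
print(tokens)
```

The escaped quote keeps its backslash either way; only the enclosing
quotation marks are removed.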



