split string at commas respecting quotes when string not in csv format

Paul McGuire ptmcg at austin.rr.com
Thu Mar 26 17:22:39 EDT 2009


On Mar 26, 2:51 pm, "R. David Murray" <rdmur... at bitdance.com> wrote:
> OK, I've got a little problem that I'd like to ask the assembled minds
> for help with.  I can write code to parse this, but I'm thinking it may
> be possible to do it with regexes.  My regex foo isn't that good, so if
> anyone is willing to help (or offer an alternate parsing suggestion)
> I would be grateful.  (This has to be stdlib only, by the way, I
> can't introduce any new modules into the application so pyparsing is
> not an option.)
>
> The challenge is to turn a string like this:
>
>     a=1,b="0234,)#($)@", k="7"
>
> into this:
>
>     [("a", "1"), ("b", "0234,)#($)@"), ("k", "7")]
>
> --
> R. David Murray            http://www.bitdance.com

If you must cram all your code into a single source file, then
pyparsing would be problematic.  But pyparsing's installation
footprint is really quite small, just a single Python source file.  So
if your program spans more than one file, just add pyparsing.py into
the local directory along with everything else.

Then you could write this little parser and be done (note that it
distinguishes the integer 1 from the quoted string "7"):

test = 'a=1,b="0234,)#($)@", k="7"'

from pyparsing import Suppress, Word, alphas, alphanums, \
    nums, quotedString, removeQuotes, Group, delimitedList

EQ = Suppress('=')
varname = Word(alphas,alphanums)
integer = Word(nums).setParseAction(lambda t:int(t[0]))
varvalue = integer | quotedString.setParseAction(removeQuotes)
var_assignment = varname("name") + EQ + varvalue("rhs")
expr = delimitedList(Group(var_assignment))

results = expr.parseString(test)
print(results.asList())
for assignment in results:
    # %r shows the repr, so the int 1 and the string '7' stay distinguishable
    print("%s <- %r" % (assignment.name, assignment.rhs))

Prints:

[['a', 1], ['b', '0234,)#($)@'], ['k', '7']]
a <- 1
b <- '0234,)#($)@'
k <- '7'
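
If pyparsing really is off the table, something along these lines might
do the job with the stdlib re module alone.  Treat it as a rough sketch
rather than a finished parser: the names PAIR and parse_pairs are made
up for this example, and it assumes values are either double-quoted with
no escaped quotes inside, or bare text running up to the next comma.
Unlike the pyparsing version, every value comes back as a string, which
matches the output asked for in the original question.

```python
import re

test = 'a=1,b="0234,)#($)@", k="7"'

# One name=value pair: a name, '=', then either a double-quoted value
# (escaped quotes are NOT handled) or a bare value up to the next comma.
# The pair must be followed by a comma or the end of the string.
PAIR = re.compile(r'\s*(\w+)\s*=\s*(?:"([^"]*)"|([^,]*))\s*(?:,|$)')

def parse_pairs(text):
    """Return a list of (name, value) string tuples parsed from text."""
    pairs = []
    pos = 0
    while pos < len(text):
        m = PAIR.match(text, pos)
        if m is None:
            raise ValueError("cannot parse at offset %d" % pos)
        name, quoted, bare = m.groups()
        # Prefer the quoted group; otherwise use the bare value, trimmed.
        value = quoted if quoted is not None else bare.strip()
        pairs.append((name, value))
        pos = m.end()
    return pairs

print(parse_pairs(test))
```

Walking the string with match() at an explicit offset, instead of calling
findall(), means unparseable input raises an error rather than being
skipped silently.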

-- Paul


