How to split a string containing nested commas-separated substrings

Paul McGuire ptmcg at austin.rr.com
Wed Jun 18 15:05:42 EDT 2008


On Jun 18, 12:19 pm, Robert Dodier <robert.dod... at gmail.com> wrote:
> Hello,
>
> I'd like to split a string by commas, but only at the "top level" so
> to speak. An element can be a comma-less substring, or a
> quoted string, or a substring which looks like a function call.
> If some element contains commas, I don't want to split it.
>
> Examples:
>
> 'foo, bar, baz' => 'foo' 'bar' 'baz'
> 'foo, "bar, baz", blurf' => 'foo' 'bar, baz' 'blurf'
> 'foo, bar(baz, blurf), mumble' => 'foo' 'bar(baz, blurf)' 'mumble'
>
> Can someone suggest a suitable regular expression or other
> method to split such strings?
>
> Thank you very much for your help.
>
> Robert

tests = """\
foo, bar, baz
foo, "bar, baz", blurf
foo, bar(baz, blurf), mumble""".splitlines()


from pyparsing import Word, alphas, alphanums, Optional, \
    Group, delimitedList, quotedString

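# An identifier: a letter or underscore, then letters, digits, or underscores.
ident = Word(alphas+"_", alphanums+"_")
# A function call: a name followed by an optional comma-separated argument list.
func_call = Group(ident + "(" + Optional(Group(delimitedList(ident)))
                  + ")")

# Try func_call before ident; otherwise "bar" would match as a bare ident
# and the parse would stop at "(".
listItem = func_call | ident | quotedString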

for t in tests:
    print(delimitedList(listItem).parseString(t).asList())


Prints:

['foo', 'bar', 'baz']
['foo', '"bar, baz"', 'blurf']
['foo', ['bar', '(', ['baz', 'blurf'], ')'], 'mumble']
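
Note that the output above keeps the quotes on the quoted string and breaks the function call into nested tokens, while the original question asked for flat strings. If flat strings are all that is needed, a hand-written scanner may be enough. What follows is only a minimal sketch using the standard library, not part of the pyparsing solution: split_top_level is a hypothetical name, and it assumes balanced parentheses, no escaped quotes, and that top-level quotes should be dropped.

```python
def split_top_level(text, sep=","):
    """Split text on sep, ignoring separators inside quotes or parentheses.

    Hypothetical helper for illustration; assumes balanced parentheses
    and no backslash-escaped quote characters.
    """
    items = []
    current = []
    depth = 0      # current parenthesis nesting level
    quote = None   # the open quote character, or None outside a string
    for ch in text:
        if quote:
            # Inside a quoted string: separators and parentheses are literal.
            if ch == quote:
                quote = None
                if depth:              # keep quotes that sit inside a call
                    current.append(ch)
            else:
                current.append(ch)
        elif ch in "\"'":
            quote = ch
            if depth:
                current.append(ch)
        elif ch == sep and depth == 0:
            # A top-level separator ends the current item.
            items.append("".join(current).strip())
            current = []
        else:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            current.append(ch)
    items.append("".join(current).strip())
    return items


if __name__ == "__main__":
    for line in ['foo, bar, baz',
                 'foo, "bar, baz", blurf',
                 'foo, bar(baz, blurf), mumble']:
        print(split_top_level(line))
```

For the three test strings this yields the flat lists from the question, with 'bar, baz' and 'bar(baz, blurf)' each kept as a single item. The trade-off is that it does no validation: it also strips whitespace at the edges of each item, including inside top-level quotes, and it will not report malformed input the way a grammar would.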


-- Paul


