split string at commas respecting quotes when string not in csv format
Tim Chase
python.list at tim.thechases.com
Thu Mar 26 16:30:18 EDT 2009
> The challenge is to turn a string like this:
>
> a=1,b="0234,)#($)@", k="7"
>
> into this:
>
> [("a", "1"), ("b", "0234,)#($)#"), ("k", "7")]
A couple of solutions that "work" for various pathological cases
of input data:
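  import re

  s = 'a=1,b="0234,)#($)@", k="7"'

  r = re.compile(r"""
    (?P<varname>\w+)
    \s*=\s*(?:
      "(?P<quoted>[^"]*)"     # quoted value, may contain commas
      |
      (?P<unquoted>[^,]+)     # bare value, runs up to the next comma
    )
    """, re.VERBOSE)

  results = [
    (m.group('varname'),
     # compare against None instead of using "or", so that an
     # empty quoted value (b="") yields '' rather than None
     m.group('quoted') if m.group('quoted') is not None
     else m.group('unquoted')
    )
    for m in r.finditer(s)
    ]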
############### or ##############################
  r = re.compile(r"""
    (\w+)
    \s*=\s*(
      "(?:[^"]*)"
      |
      [^,]+
    )
    """, re.VERBOSE)

  results = [
    (m.group(1), m.group(2).strip('"'))
    for m in r.finditer(s)
    ]
Things like internal quoting ('b="123\"456", c="123""456"') would
require a slightly smarter parser.
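
As a rough sketch of what "slightly smarter" might look like, the
pattern below lets a quoted value contain either a backslash-escaped
character or a doubled quote, then undoes the escaping afterwards.
The names PAIR, UNESCAPE and parse_pairs are made up for this
example, and treating both escape styles as valid at the same time
is an assumption about the input, not something the original data
format specifies.

```python
import re

# Hypothetical pattern: key = "quoted value" or key = bare value.
PAIR = re.compile(r'''
    (?P<key>\w+)
    \s*=\s*
    (?:
        "(?P<quoted>(?:[^"\\]|\\.|"")*)"   # plain char, \x escape, or ""
        |
        (?P<bare>[^,]*)                    # bare value, up to the next comma
    )
    ''', re.VERBOSE)

# Matches one escape sequence inside a quoted value.
UNESCAPE = re.compile(r'\\(.)|""')


def parse_pairs(text):
    """Return a list of (key, value) tuples parsed from text."""
    results = []
    for m in PAIR.finditer(text):
        quoted = m.group('quoted')
        if quoted is not None:
            # \x becomes x; a doubled quote becomes a single quote.
            value = UNESCAPE.sub(lambda e: e.group(1) or '"', quoted)
        else:
            value = m.group('bare').strip()
        results.append((m.group('key'), value))
    return results
```

With this, parse_pairs(r'b="123\"456", c="123""456"') should give
[('b', '123"456'), ('c', '123"456')], and the original test string
still comes back as before. Anything beyond that (unterminated
quotes, newlines inside values) is better handed to a real
tokenizer.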
-tkc