parsing CSV files with quotes

Tom Culliton culliton at clark.net
Thu Mar 30 16:33:44 EST 2000


I've had good luck recently with splitting on a separator and then
pasting back together any pieces which have "unbalanced" quotes.  It's
not the fastest solution but the code was simple and it worked.  My
quoting rules were slightly different from yours (Plan9 rc style),
which let me use string.count to detect unbalanced quotes, but the
principle should still apply, possibly using re.findall.
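
A minimal sketch of that idea, written for a current Python rather than
the string module of the day.  This is not the code I actually used; the
names split_quoted, count_plain and count_unescaped are made up for
illustration.  It assumes rc-style quoting (single quotes, with a literal
quote written as two quotes), so a complete field always holds an even
number of quote characters.  For backslash-escaped double quotes, the
re.findall variant counts only the quotes not preceded by a backslash.

```python
import re

def count_plain(piece, quote="'"):
    # rc style: an embedded quote is doubled, so a plain count is enough.
    return piece.count(quote)

def count_unescaped(piece):
    # Backslash style: match either an escaped character or a bare quote,
    # then count only the bare quotes.
    return re.findall(r'\\.|"', piece).count('"')

def split_quoted(line, sep=",", count_quotes=count_plain):
    """Split on sep, then paste back pieces whose quotes are unbalanced."""
    fields = []
    pending = None  # pieces of a field whose quote is still open
    for piece in line.split(sep):
        if pending is not None:
            piece = pending + sep + piece
        if count_quotes(piece) % 2:
            pending = piece      # odd count: the separator was inside quotes
        else:
            fields.append(piece)
            pending = None
    if pending is not None:
        raise ValueError("unbalanced quote in: %r" % line)
    return fields

print(split_quoted("1, 'Postma, Warren', 30"))
print(split_quoted('4, "Austin, \\"Stone Cold\\" Steve", 34',
                   count_quotes=count_unescaped))
```

The fields come back with their quotes and leading blanks intact;
stripping those is a separate pass.  Rejoining rebuilds each string
piece by piece, which is why this is not the fastest way to do it.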

In article <BDLE4.1642$HG1.47883 at nnrp1.uunet.ca>,
Warren Postma <embed at geocities.com> wrote:
>Suppose I have a CSV file where line 1 is the column names, and lines 2..n
>are comma separated variables, where all String fields are quoted like this:
>
>ID, NAME, AGE
>1, "Postma, Warren", 30
>2, "Twain, Shania",  31
>3, "Nelson, Willy",  57
>4, "Austin, \"Stone Cold\" Steve", 34
>
>So, the obvious thing I tried is:
>
>import string
>>>> print string.splitfields("4, \"Austin, \\\"Stone Cold\\\" Steve, 34",",")
>['4', ' "Austin', ' \\"Stone Cold\\" Steve', ' 34']
>
>Hmm. Interesting. So I tried this:


