split CSV fields

John Machin sjmachin at lexicon.net
Thu Nov 16 06:11:46 EST 2006


John Machin wrote:
> Fredrik Lundh wrote:
> > robert wrote:
> >
> > > What is a most simple expression for splitting a CSV line
> > > with "-protected fields?
> > >
> > > s='"123","a,b,\"c\"",5.640'
> >
> > import csv
> >
> > the preferred way is to read the file using that module.  if you insist
> > on processing a single line, you can do
> >
> >      cols = list(csv.reader([string]))
> >
> > </F>
>
> Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit
> (Intel)] on win32
> | >>> import csv
> | >>> s='"123","a,b,\"c\"",5.640'
> | >>> cols = list(csv.reader([s]))
> | >>> cols
> [['123', 'a,b,c""', '5.640']]
> # maybe we need a bit more:
> | >>> cols = list(csv.reader([s]))[0]
> | >>> cols
> ['123', 'a,b,c""', '5.640']
>
> I'd guess that the OP is expecting 'a,b,"c"' for the second field.
>
> Twiddling with the knobs doesn't appear to help:
>
> | >>> list(csv.reader([s], escapechar='\\'))[0]
> ['123', 'a,b,c""', '5.640']
> | >>> list(csv.reader([s], escapechar='\\', doublequote=False))[0]
> ['123', 'a,b,c""', '5.640']
>
> Looks like a bug to me; AFAICT from the docs, the last attempt should
> have worked.
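
One thing worth ruling out before blaming the reader: in an ordinary (non-raw) Python literal, \" is just an escape for ", so the s typed in the session above contains no backslashes at all and escapechar has nothing to act on. A minimal sketch, assuming the OP's real data does contain backslash-escaped quotes (the names plain and raw are mine, not from the thread):

```python
import csv

# As typed in the session: \" collapses to a bare " inside a normal literal.
plain = '"123","a,b,\"c\"",5.640'
# Raw literal: the backslashes survive, as they would when read from a file.
raw = r'"123","a,b,\"c\"",5.640'

print('\\' in plain)   # False: no backslash for escapechar to see
print('\\' in raw)     # True

cols_plain = next(csv.reader([plain], escapechar='\\'))
cols_raw = next(csv.reader([raw], escapechar='\\'))

print(cols_plain)      # ['123', 'a,b,c""', '5.640']
print(cols_raw)        # ['123', 'a,b,"c"', '5.640']
```

If that assumption about the data holds, the test string wants a raw literal (or to be read from the file itself), and escapechar then yields 'a,b,"c"' for the second field.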

Given Peter Otten's post, it looks like
(1) there's a bug in the "fmtparam" mechanism -- it's ignoring the
escapechar in my first twiddle, which should give the same result as
Peter's.
(2) the default dialect has doublequote switched on:
| >>> csv.excel.doublequote
True
According to my reading of the docs:
"""
doublequote
Controls how instances of quotechar appearing inside a field should
themselves be quoted. When True, the character is doubled. When False,
the escapechar is used as a prefix to the quotechar. It defaults to
True.
"""
So with doublequote left at its default of True, Peter's example should
not have worked.
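
The doc paragraph describes what a writer emits, which is easier to see than to read. A rough sketch of that behaviour as I understand it in current CPython, not a statement about what the 2.5 reader is meant to do; write_row is a made-up helper and the field value is borrowed from the OP's example:

```python
import csv
import io

def write_row(row, **fmtparams):
    """Hypothetical helper: serialise one row with csv.writer, return the text."""
    buf = io.StringIO()
    csv.writer(buf, **fmtparams).writerow(row)
    return buf.getvalue()

field = 'a,b,"c"'

# doublequote=True (the default): embedded quotes are doubled.
doubled = write_row([field])
# doublequote=False: embedded quotes are prefixed with the escapechar.
escaped = write_row([field], doublequote=False, escapechar='\\')

print(doubled.rstrip())   # "a,b,""c"""
print(escaped.rstrip())   # "a,b,\"c\""

# Reading each form back with matching parameters recovers the field.
from_doubled = next(csv.reader([doubled]))
from_escaped = next(csv.reader([escaped], doublequote=False, escapechar='\\'))
# Leaving doublequote at True while setting escapechar also parses it.
from_escaped_dq = next(csv.reader([escaped], escapechar='\\'))
```

On the reading side the last line suggests that an escapechar, once set, is honoured inside a quoted field regardless of doublequote, which would explain why Peter's example works even though the doc wording reads as if it should not.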



