csv Parser Question - Handling of Double Quotes

jwbrown77 at gmail.com
Thu Mar 27 17:40:58 EDT 2008


On Mar 27, 1:53 pm, "Gabriel Genellina" <gagsl-... at yahoo.com.ar>
wrote:
> On Thu, 27 Mar 2008 17:37:33 -0300, Aaron Watters
> <aaron.watt... at gmail.com> wrote:
>
>
>
> >> "this";"is";"a";"test"
>
> >> Resulting in an output of:
>
> >> ['this', 'is', 'a', 'test']
>
> >> However, if I modify the csv to:
>
> >> "t"h"is";"is";"a";"test"
>
> >> The output changes to:
>
> >> ['th"is"', 'is', 'a', 'test']
>
> > I'd be tempted to say that this is a bug,
> > except that I think the definition of "csv" is
> > informal, so the "bug/feature" distinction
> > cannot be exactly defined, unless I'm mistaken.
>
> AFAIK, the csv module tries to mimic Excel behavior as closely as possible.
> It has some test cases that look horrible, but that's what Excel does...  
> I'd try actually using Excel to see what happens.
> Perhaps the behavior could be more configurable, like the codecs are.
>
> --
> Gabriel Genellina

Thank you, Aaron and Gabriel.  I was also hesitant to use the term
"bug" since, as you said, CSV isn't a standard.  By the same token, I
couldn't readily think of a case where a quote should be removed
when it isn't sitting right next to the delimiter (or at the very
beginning/end of the line).
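
For anyone who wants to reproduce this, here is a minimal sketch. It
assumes Python 3 syntax and the standard csv module with delimiter=';';
the variable names are only for illustration. The last part uses the
reader's strict option, which (in current Python versions, as far as I
can tell) rejects the malformed field instead of guessing:

```python
import csv

# The modified line from the example above.
line = '"t"h"is";"is";"a";"test"'

# Default (lenient) parsing: the first quote opens the field, the second
# one closes it, and every later quote is kept as an ordinary character.
rows = list(csv.reader([line], delimiter=';'))
print(rows[0])  # ['th"is"', 'is', 'a', 'test']

# With strict=True the reader raises csv.Error on the bad input
# rather than silently dropping the quotes.
try:
    list(csv.reader([line], delimiter=';', strict=True))
    strict_error = None
except csv.Error as exc:
    strict_error = exc
print(strict_error)
```

So the options seem to be "lenient and surprising" or "strict and
raising", with nothing in between.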

I'm not even sure if it should be patched since there could be cases
where this is how people want it to behave and I wouldn't want their
code to break.

I think rolling my own parser class is the only solution, but if
anyone has other advice I'd like to hear it.
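
A rough sketch of what I have in mind, written as a plain function
rather than a class to keep it short. The name split_lenient and its
parameters are made up for illustration and are not part of the csv
module. The rule it assumes: a quote only opens a field when it is the
first character of that field, and only closes it when the next
character is the delimiter or the end of the line; any other quote is
kept as data. It does not collapse doubled quotes, and it expects the
line to have its trailing newline already stripped.

```python
def split_lenient(line, delimiter=';', quotechar='"'):
    """Split one line, stripping quotes only at field boundaries."""
    fields = []
    buf = []
    in_quotes = False
    at_field_start = True
    n = len(line)
    for i, ch in enumerate(line):
        if in_quotes:
            nxt = line[i + 1] if i + 1 < n else None
            if ch == quotechar and (nxt is None or nxt == delimiter):
                # Closing quote: it sits right before a delimiter
                # or at the very end of the line.
                in_quotes = False
            else:
                # Anything else inside quotes is data, including
                # embedded quotes and delimiters.
                buf.append(ch)
        elif ch == quotechar and at_field_start:
            # Opening quote: only recognized at the start of a field.
            in_quotes = True
            at_field_start = False
        elif ch == delimiter:
            fields.append(''.join(buf))
            buf = []
            at_field_start = True
        else:
            buf.append(ch)
            at_field_start = False
    fields.append(''.join(buf))
    return fields


print(split_lenient('"this";"is";"a";"test"'))
print(split_lenient('"t"h"is";"is";"a";"test"'))
```

With this rule the second line keeps its inner quotes, giving t"h"is
as the first field. The obvious weak spot is a quoted field that
really contains a quote followed by the delimiter; this sketch would
end the field there.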

Thanks again for the help.


