Troubles with CSV file

Skip Montanaro skip at pobox.com
Fri May 14 09:50:38 EDT 2004


    Vladimir> I have a big CSV file, which I must read and do some
    Vladimir> processing with it.  Unfortunately I can't figure out how to
    Vladimir> use standard *csv* module in my situation. The problem is that
    Vladimir> some records look like:

    Vladimir> ""read this, man"", 1

Invalid CSV...

    Vladimir> Quick experiment show me that *csv* module (with default
    Vladimir> 'excel' dialect) expects something like

    Vladimir>      """read this, man""", 1

Valid CSV...

Here's the rule.  If a field contains the separator character (a comma in
this case) or the quote character, the field must be quoted, and any embedded
quote characters must be doubled.  Your first example doesn't obey this
rule.  The second one does.
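
If it helps to see the rule in action, here is a small sketch that lets the
csv module do the quoting itself and then reads the result back.  It uses
only the default 'excel' dialect; the in-memory buffer and the variable names
are just for illustration.

```python
import csv
import io

# Write one record whose first field contains both quotes and a comma.
buf = io.StringIO()
writer = csv.writer(buf)  # default 'excel' dialect
writer.writerow(['"read this, man"', 1])

# The writer quotes the field and doubles the embedded quote characters.
line = buf.getvalue()
print(repr(line))  # '"""read this, man""",1\r\n'

# Reading the line back recovers the original field (numbers come back as text).
rows = list(csv.reader(io.StringIO(line)))
print(rows)  # [['"read this, man"', '1']]
```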

    Vladimir> Maybe some *alternative* CSV parsers can help?  Any
    Vladimir> suggestions are welcomed.

I suggest you modify the program that wrote the CSV file in the first place
and regenerate the file.  If that's not possible, perhaps you can massage it
by hand (or write a smallish Python script to massage it) now that you know
how the file must be formatted.
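
A massaging script might look something like the sketch below.  It rests on
assumptions I can't check against your data: that a broken field is wrapped in
exactly two quotes on each side, fills the whole field, contains no other
quote characters, and never spans lines.  The names fix_line and massage are
made up for this example.

```python
import csv
import io
import re

# Assumption: a broken field looks like ""text"" and sits between field
# boundaries (start of line, a comma, or end of line).
BROKEN_FIELD = re.compile(r'(^|,)""([^"]+)""(?=,|$)')

def fix_line(line):
    """Add the missing third quote on each side of a broken field."""
    return BROKEN_FIELD.sub(r'\1"""\2"""', line)

def massage(src, dst):
    """Copy lines from src to dst, repairing broken fields on the way."""
    for line in src:
        dst.write(fix_line(line.rstrip("\r\n")) + "\n")

# Try it on an in-memory sample instead of a real file.
broken = io.StringIO('""read this, man"", 1\nplain,2\n')
fixed = io.StringIO()
massage(broken, fixed)

fixed.seek(0)
rows = list(csv.reader(fixed))
print(rows)  # [['"read this, man"', ' 1'], ['plain', '2']]
```

With a real file you would pass two open file objects to massage() instead of
the StringIO buffers.  If your records break in some other way, the regular
expression will need adjusting.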

Skip




More information about the Python-list mailing list