Trying to fix Invalid CSV File

Emile van Sebille emile at fenx.com
Mon Aug 4 01:38:22 EDT 2008


Ryan Rosario wrote:
> I have a very large CSV file that contains double quoted fields (since
> they contain commas). Unfortunately, some of these fields also contain
> other double quotes and I made the painful mistake of forgetting to
> escape or double the quotes inside the field:
> 
> 123,"Here is some, text "and some quoted text" where the quotes should
> have been doubled",321
> 


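import csv

rec = ('123,"Here is some, text "and some quoted text" where the quotes '
       'should have been doubled",321')

# Assumes a quoted field is never first or last on the line, embedded
# quotes never sit next to a comma, and the data never contains '''.
fixed = (rec.replace(',"', ',"""')    # mark each opening delimiter as """
            .replace('",', '""",')    # mark each closing delimiter as """
            .replace('"""', "'''")    # park the delimiters as '''
            .replace('"', '""')       # double the remaining (embedded) quotes
            .replace("'''", '"'))     # restore the delimiters

print(next(csv.reader([fixed])))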

['123', 'Here is some, text "and some quoted text" where the quotes 
should have been doubled', '321']

:))
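
For a whole file, the same idea can be sketched with a regular expression instead of the replace chain. This is only a rough sketch under assumptions: repair_line and repair_csv are made-up names, each record sits on one line, no embedded quote touches a comma, and none of the quotes in the file were already doubled correctly. A quote is treated as a field delimiter only when it sits next to a comma or at the start or end of the line; every other quote gets doubled.

```python
import csv
import io
import re

# Hypothetical helper, not from the original post: matches a quote that is
# neither at the start of the line, nor right after a comma, nor right
# before a comma or the end of the line.
_STRAY_QUOTE = re.compile(r'(?<!^)(?<!,)"(?!,|$)')

def repair_line(line):
    """Double every quote that is not at a field boundary."""
    return _STRAY_QUOTE.sub('""', line.rstrip("\r\n"))

def repair_csv(src, dst):
    """Copy src to dst line by line, repairing each record."""
    for line in src:
        dst.write(repair_line(line) + "\n")

# In-memory stand-ins for the broken input file and the repaired output file.
broken = io.StringIO(
    '123,"Here is some, text "and some quoted text" here",321\n'
    '"first "field" quoted",7\n'
)
fixed = io.StringIO()
repair_csv(broken, fixed)
rows = list(csv.reader(io.StringIO(fixed.getvalue())))
```

Unlike the replace chain, this also copes with a quoted field at the start or end of a line, but a record such as "a, "b", c" stays ambiguous either way and would need checking by hand.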

Emile


> Has anyone dealt with this problem before? Any ideas of an algorithm I
> can use for a Python script to create a new, repaired CSV file?
> 
> TIA,
> Ryan
> --
> http://mail.python.org/mailman/listinfo/python-list
> 



