Trying to fix Invalid CSV File

Emile van Sebille emile at fenx.com
Mon Aug 4 11:30:08 EDT 2008


John Machin wrote:
> On Aug 4, 6:15 pm, Ryan Rosario <uclamath... at gmail.com> wrote:
>> On Aug 4, 1:01 am, John Machin <sjmac... at lexicon.net> wrote:
>>
>>> On Aug 4, 5:49 pm, Ryan Rosario <uclamath... at gmail.com> wrote:
>>>> Thanks Emile! Works almost perfectly, but is there some way I can
>>>> adapt this to quote fields that contain a comma in them?
<snip>

> Emile's snippet is pushing it through the csv reading process, to
> demonstrate that his series of replaces works (on your *sole* example,
> at least). 

Exactly -- just print out the results of the passed argument:

 >>> rec.replace(',"',",'''").replace('",',"''',").replace('"','""').replace("'''",'"')
'123,"Here is some, text ""and some quoted text"" where the quotes should have been doubled",321'

It won't work if any of the quotes embedded in a field sit right next 
to a comma.
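
To make that failure concrete, here is a rough sketch that pushes two 
records through the same chain of replaces and then through the csv 
module.  The function names and both sample records are made up for 
illustration; they are not from Ryan's file.

```python
import csv

def double_embedded_quotes(rec):
    # The chain of replaces from above, wrapped in a function.
    # The function name is hypothetical.
    return (rec.replace(',"', ",'''")
               .replace('",', "''',")
               .replace('"', '""')
               .replace("'''", '"'))

def parse(rec):
    # Repair one record, then parse it with the csv module's default dialect.
    return next(csv.reader([double_embedded_quotes(rec)]))

# Hypothetical records: embedded quotes away from commas, and next to one.
ok = '123,"Here is some, text "and some quoted text" in the middle",321'
bad = '123,"He said "hi", then left",321'

print(len(parse(ok)), parse(ok))    # parses into the expected 3 fields
print(len(parse(bad)), parse(bad))  # the '",' inside the text splits the field
```

In the second record the embedded '",' looks exactly like the end of a 
quoted field, so the replaces treat it as one and the record comes back 
with an extra field.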

I'd run it against the file.  Presumably you expect the same number of 
fields in every record.  Any resulting record that doesn't match is 
suspect, and flags a record this approach can't handle.
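
Something along these lines would do the field-count check.  It is only 
a sketch: the expected count of 3, the function names, and the sample 
lines are assumptions, so adjust them to the real file.

```python
import csv

EXPECTED_FIELDS = 3  # assumption: every good record has three fields

def double_embedded_quotes(rec):
    # Same chain of replaces as in the interpreter session above.
    return (rec.replace(',"', ",'''")
               .replace('",', "''',")
               .replace('"', '""')
               .replace("'''", '"'))

def suspect_records(lines, expected=EXPECTED_FIELDS):
    # Return (line number, original text) for every record whose repaired
    # form does not parse into the expected number of fields.
    suspects = []
    for lineno, line in enumerate(lines, 1):
        rec = line.rstrip("\r\n")
        try:
            row = next(csv.reader([double_embedded_quotes(rec)]))
        except (csv.Error, StopIteration):
            row = []  # unparseable records count as suspect too
        if len(row) != expected:
            suspects.append((lineno, rec))
    return suspects

# Hypothetical sample; with a real file, pass the open file object instead,
# e.g. suspect_records(open("data.csv")) where "data.csv" is a made-up name.
sample = [
    '123,"Here is some, text "and some quoted text" in the middle",321\n',
    '456,"He said "hi", then left",654\n',
]
for lineno, rec in suspect_records(sample):
    print(lineno, rec)
```

Whatever it prints is the short list of records to fix by hand or with a 
smarter rule.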

There are probably better ways, but sometimes it's fun to create 
executable line noise.  :)

Emile



