Suggestions for workaround in CSV bug

Stephen Simmons mail at stevesimmons.com
Tue Jan 24 10:08:19 EST 2006


Simmons, Stephen wrote:

> > 
> > I've come across a bug in CSV where the csv.reader() raises an 
> > exception if the input line contains '\r'. Example code and output
> > below shows a test case where csv.reader() cannot read an array
> > written by csv.writer(). 
> > 
> > Error: newline inside string
> > WARNING: Failure executing file: <csv_error.py>
>   
Michael Stroeder suggested:
 > Did you play with the csv.Dialect setting lineterminator='\n' ?

This didn't make any difference.

I found a bug report on SourceForge that describes this: bug #967934, "csv module cannot handle embedded \r". The description says "CSV module cannot handle the case of embedded \r (i.e. carriage return) in a field. As far as I can see, this is hard-coded into the _csv.c file and cannot be fixed with Dialect changes."

However, I found a workaround: opening the file in universal newline mode. As shown below, this stops csv from raising an exception. It's still a bug, though, as the programmer needs to know to read the file with mode 'rUb' instead of the usual 'rb', even though it was written with 'wb'. Note also that the embedded '\r' is read back as '\n', so the data does not round-trip exactly.

Is this fixed for Python 2.5?

Cheers

Stephen

#-----------------------------------------
import csv

s = [ ['a'], ['\r'], ['b'] ]
name = 'c://temp//test2.csv'

print 'Writing CSV file containing %s' % repr(s)
f = file(name, 'wb')
csv.writer(f).writerows(s)
f.close()

print 'CSV file is %s' % repr(file(name, 'rb').read())

print 'Now reading back as CSV...'
# This gives a _csv.Error exception when csv.reader() encounters the \r:
# f = file(name, 'rb')
# But adding the universal newline format U makes everything OK again:
f = file(name, 'rUb')
for r in csv.reader(f):
    print 'Read row containing %s' % repr(r)

# Output is:
# Writing CSV file containing [['a'], ['\r'], ['b']]
# CSV file is 'a\r\n"\r"\r\nb\r\n'
# Now reading back as CSV...
# Read row containing ['a']
# Read row containing ['\n']
# Read row containing ['b']
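
For comparison, here is a minimal sketch of the same round trip, assuming a Python 3 interpreter rather than the Python 2 used above. There the `newline` argument to open() takes the place of the 'U' flag: newline=None translates line endings the way universal newline mode does, while newline='' passes them through untouched. The helper name `roundtrip` and the temporary file are only for illustration.

```python
import csv
import os
import tempfile

ROWS = [['a'], ['\r'], ['b']]


def roundtrip(read_newline):
    """Write ROWS with csv.writer, then read them back with csv.reader.

    read_newline is passed to open() as its newline argument when reading.
    """
    fd, path = tempfile.mkstemp(suffix='.csv')
    os.close(fd)
    try:
        # newline='' stops the text layer from altering the writer's '\r\n'.
        with open(path, 'w', newline='') as f:
            csv.writer(f).writerows(ROWS)
        with open(path, 'r', newline=read_newline) as f:
            return list(csv.reader(f))
    finally:
        os.remove(path)


if __name__ == '__main__':
    print('newline=None (universal newlines): %r' % roundtrip(None))
    print("newline='' (no translation):       %r" % roundtrip(''))
```

Under these assumptions, reading with newline=None should reproduce the lossy result above (the '\r' field comes back as '\n'), while reading with newline='' should return the '\r' field unchanged.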



