Dealing with \r in CSV fields in Python2.4

Terry Reedy tjreedy at udel.edu
Wed Sep 4 17:15:10 EDT 2013


On 9/4/2013 11:04 AM, Tim Chase wrote:
> I've got some old 2.4 code (requires an external lib that hasn't been
> upgraded) that needs to process a CSV file where some of the values
> contain \r characters.  It appears that in more recent versions (just
> tested in 2.7; docs suggest this was changed in 2.5), Python does the
> Right Thing™ and just creates values in the row containing that \r.
> However, in 2.4, the csv module chokes on it with
>
>    _csv.Error: newline inside string
>
> as demoed by the example code at the bottom of this email.

While probably not necessary for this problem, one can use more than one
Python version to solve a problem. For instance, you could use a current
version to read the data and transform it so that it can be piped to 2.4
code running in a subprocess.
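
A rough sketch of that idea, written for a current Python 3 (not 2.4).
The placeholder token, the function names, and the legacy command line
are all made up for illustration, and the sketch assumes the token never
occurs in the real data:

```python
import csv
import io
import subprocess

# Hypothetical placeholder (backslash + "r"); assumed absent from real data.
CR_TOKEN = "\\r"

def escape_cr(src, dst, token=CR_TOKEN):
    """Copy CSV rows from src to dst, replacing each \r inside a field."""
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        writer.writerow([field.replace("\r", token) for field in row])

def feed_legacy(path, legacy_cmd):
    """Stream the escaped CSV to the stdin of the old interpreter.

    legacy_cmd is an assumption, e.g. ["python2.4", "legacy_script.py"].
    """
    proc = subprocess.Popen(legacy_cmd, stdin=subprocess.PIPE)
    with open(path, newline="") as src:
        out = io.TextIOWrapper(proc.stdin, newline="")
        escape_cr(src, out)
        out.close()  # flushes and closes the pipe so the child sees EOF
    return proc.wait()
```

The 2.4 side would then read sys.stdin with csv.reader as usual and call
field.replace('\\r', '\r') on each value to restore the original data,
so nothing is thrown away.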

>  What's the
> best way to deal with this?  At the moment, I'm just using something
> like
>
>    def unCR(f):
>      for line in f:
>        yield line.replace('\r', '')
>
>    f = file('input.csv', 'rb')
>    for row in csv.reader(unCR(f)):
>      code_to_process(row)
>
> but this throws away data that I'd really prefer to keep if possible.
>
> I know 2.4 isn't exactly popular, and in an ideal world, I'd just
> upgrade to a later 2.x version that does what I need.  Any old-time
> 2.4 pythonistas have sage advice for me?
>
> -tkc
>
>
> from cStringIO import StringIO
> import csv
> f = file('out.txt', 'wb')
> w = csv.writer(f)
> w.writerow(["One", "Two"])
> w.writerow(["First\rSecond", "Third"])
> f.close()
>
> f = file('out.txt', 'rb')
> r = csv.reader(f)
> for i, row in enumerate(r): # works in 2.7, fails in 2.4
>      print repr(row)
> f.close()
>
>


-- 
Terry Jan Reedy




