[Python-Dev] CSV, bytes and encodings

R. David Murray rdmurray at bitdance.com
Thu Apr 2 00:22:26 CEST 2009


On Wed, 1 Apr 2009 at 10:53, Antoine Pitrou wrote:
> Perhaps. But without using 'rU' the file couldn't be read at all.
> (I'm not sure it was Windows line endings by the way; perhaps Macintosh ones;
> anyway, it didn't work using 'rb')

I just tested it in 2.6.  It must have been old Mac line endings (\r),
which indeed gave me the error message you mentioned.  Windows line
endings worked fine for me reading in binary mode on Linux.
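
For anyone who wants to reproduce this, here is a rough sketch in
Python 3 rather than 2.6, so treat it as an approximation of what I
tested.  It imitates binary-mode reading by splitting the raw bytes on
\n only (which is what iterating over io.BytesIO does) and decoding each
chunk before handing it to csv.reader.  The sample data and the helper
names are made up for illustration.

```python
import csv
import io

# Hypothetical sample data: the same two rows with different line endings.
WINDOWS_RAW = b"a,b\r\nc,d\r\n"
OLD_MAC_RAW = b"a,b\rc,d\r"


def binary_mode_lines(raw):
    """Imitate binary-mode iteration: lines are split on b'\\n' only."""
    return [line.decode("ascii") for line in io.BytesIO(raw)]


def read_binary_style(raw):
    """Parse the chunks produced by binary-style line splitting."""
    return list(csv.reader(binary_mode_lines(raw)))


def read_with_newline_handling(raw):
    """Parse via a text stream opened with newline='', as the csv docs advise."""
    text = io.StringIO(raw.decode("ascii"), newline="")
    return list(csv.reader(text))


# Windows line endings survive binary-style splitting.
windows_rows = read_binary_style(WINDOWS_RAW)

# Old Mac line endings arrive as one chunk with bare \r inside it,
# which the csv parser rejects.
try:
    read_binary_style(OLD_MAC_RAW)
    old_mac_error = None
except csv.Error as exc:
    old_mac_error = str(exc)

# Letting the text layer recognise \r as a line ending fixes the read.
old_mac_rows = read_with_newline_handling(OLD_MAC_RAW)
```

In Python 3 the csv documentation recommends opening files with
newline='', which plays the role that 'rU' played in the case above.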

> I have to add that if individual fields really can contain newlines, then the
> CSV module ought to be smarter when /saving/ those fields. I've inadvertently
> tried to produce a CSV file with such fields and it ended up wrong when opened
> as a spreadsheet (text after the newlines was ignored in Gnumeric and in
> OpenOffice, while Excel displayed a spurious additional row containing only the
> text after the newline).

I just added some tests to trunk that seem to indicate this case is
handled correctly in terms of preserving the data.  Maybe you didn't
write the file such that the fields with the newlines were quoted?
And of course how non-Excel applications handle that data on import
can be different from how Excel handles it.
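
A minimal round-trip sketch of that case, written for Python 3 and not
taken from the trunk tests themselves: with the writer's default quoting
(csv.QUOTE_MINIMAL), a field containing a newline is quoted on output,
and the reader restores it intact.  The row contents are invented for
the example.

```python
import csv
import io

# Hypothetical rows; the second row has a field with an embedded newline.
rows = [["id", "comment"], ["1", "first line\nsecond line"]]

# newline="" keeps the stream from translating the line endings
# that the csv module writes and reads.
buf = io.StringIO(newline="")
csv.writer(buf).writerows(rows)
written = buf.getvalue()

# Read the same text back and compare with what went in.
read_back = list(csv.reader(io.StringIO(written, newline="")))
```

Here read_back compares equal to rows, and the multi-line field shows up
in written wrapped in double quotes.  Whether a given spreadsheet
application imports such a quoted field correctly is a separate
question.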

--David

