[Python-Dev] CSV, bytes and encodings

R. David Murray rdmurray at bitdance.com
Wed Apr 1 16:54:19 CEST 2009


On Wed, 1 Apr 2009 at 05:37, skip at pobox.com wrote:
> This case arises rarely, but it does turn up every now and again.  If you

For some definition of "rarely".  I don't handle CSV files generated by
Windows very often, but I've run into it at least a couple of times.  That
says to me that it isn't all that rare in the wild.  (One out of
fifty?  But I'm sure it depends on your data sources; some people
will run into it often, others almost never.)

Of course, on unix it doesn't help much having those newlines preserved,
since there are few tools on unix other than the CSV module that even
attempt to deal with newlines inside quoted strings being data, but on
Windows it makes a difference.
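
To make the point about quoted newlines concrete, here is a minimal
sketch in Python 3 (not the 2.6 code under discussion).  The sample
data is invented for illustration.  It shows the csv module treating a
line break inside a quoted field as data, while a naive line split
breaks the record in two:

```python
import csv
import io

# Invented sample: Windows line endings, with one line break embedded
# inside a quoted field.
raw = 'id,comment\r\n1,"first line\r\nsecond line"\r\n2,plain\r\n'

# newline='' keeps the io layer from translating line endings, so the
# csv reader sees the original characters.
rows = list(csv.reader(io.StringIO(raw, newline='')))

# A tool that just splits on line boundaries sees four "records".
naive_lines = raw.splitlines()

print(rows)
print(len(naive_lines))
```

The reader returns three rows, and the second field of the middle row
still contains the original "\r\n".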

It would actually be nice if the CSV module had an option for turning
those quoted newlines into spaces, but that's a feature request and
is out of scope for this discussion :)
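
In the meantime the same effect can be had with a small wrapper.  This
is only a sketch in Python 3: read_flattened is a hypothetical helper
name, not an option or function of the csv module, and the sample data
is made up.

```python
import csv
import io
import re

# Match one line break of any flavour; "\r\n" is listed first so it is
# replaced by a single space rather than two.
_LINE_BREAK = re.compile(r'\r\n|\r|\n')

def read_flattened(source, **reader_kwargs):
    """Yield csv rows with line breaks inside fields replaced by a space.

    Hypothetical helper; extra keyword arguments go to csv.reader.
    """
    for row in csv.reader(source, **reader_kwargs):
        yield [_LINE_BREAK.sub(' ', field) for field in row]

raw = 'id,comment\r\n1,"first line\r\nsecond line"\r\n'
flattened = list(read_flattened(io.StringIO(raw, newline='')))
print(flattened)
```

Doing the replacement after parsing, rather than on the raw text, means
only line breaks that the reader has already classified as field data
are touched; the record terminators are left alone.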

>    Antoine> The documentation is, IMO, wrong even in 2.x. Just yesterday I
>    Antoine> had to open a CSV file in 'rU' mode because it had Windows line
>    Antoine> endings and I'm under Linux....

That sounds like a bug, IMO.  From the source code it looks like the
2.6 _csv module should be handling that, and certainly intended to
handle it.

--David

