[issue30825] csv.Sniffer does not detect lineterminator

Neil Schemenauer report at bugs.python.org
Thu Oct 4 18:33:40 EDT 2018


Neil Schemenauer <nas-python at arctrix.com> added the comment:

There is another issue related to this.  If you use codecs to get a reader, it uses str.splitlines() internally, which treats many characters besides \r and \n as line terminators (for example \v, \f and U+2028).  See issue #18291 and:

https://docs.python.org/3.8/library/stdtypes.html#str.splitlines
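
To illustrate the point about str.splitlines(), here is a small sketch. The sample string is made up for this example; it only shows that splitlines() breaks on characters that a CSV field could legitimately contain, while splitting on "\n" alone does not.

```python
# Hypothetical sample: one logical line containing VT, FF and U+2028,
# followed by a real newline and a second line.
text = "one\x0btwo\x0cthree\u2028four\nfive"

# str.splitlines() treats \x0b, \x0c and \u2028 as line boundaries.
by_splitlines = text.splitlines()

# Splitting on "\n" only keeps the first logical line intact.
by_newline = text.split("\n")

print(by_splitlines)
print(by_newline)
```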

I was thinking about different ways to fix this.  First, the csv module documentation recommends passing newline='' when opening the file object.  I suspect most people don't know to do that.  So, I thought maybe the csv module should inspect the file object that gets passed in and then warn if newline='' has not been used or if the file is a codecs reader object.
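
For reference, a minimal sketch of what newline='' changes in practice. The temporary file and the sample row are assumptions made up for this example. It writes one row whose first field contains an embedded "\r\n", then reads it back with and without newline=''.

```python
import csv
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
fd, path = tempfile.mkstemp(suffix=".csv")
os.close(fd)
try:
    # Write a row whose first field contains an embedded CRLF.
    with open(path, "w", newline="") as f:
        csv.writer(f).writerow(["a\r\nb", "c"])

    # With newline='', the csv reader sees the raw line endings.
    with open(path, newline="") as f:
        preserved = list(csv.reader(f))

    # Without it, universal newlines mode turns "\r\n" into "\n"
    # before the csv reader ever sees the data.
    with open(path) as f:
        translated = list(csv.reader(f))
finally:
    os.remove(path)

print(preserved)
print(translated)
```

Without newline='', the embedded line ending inside the quoted field is silently rewritten, and nothing warns about it.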

However, that seems fairly complicated.  Would it be better if we changed the 'csv' module to do its own line splitting?  I think that would be better, although I'm not sure about backwards compatibility.  Currently, the reader expects to call iter() on the input file.  Would it be okay if it used the file's 'read' method in preference to iter()?  It could still fall back to iter() if there was no read method.
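
A rough sketch of that idea, written outside the csv module. The helper names split_csv_lines and read_rows are hypothetical and not part of any existing API. For brevity it reads the whole input at once, which a real implementation would presumably avoid by reading in chunks.

```python
import csv
import io
import re

# One line including its terminator (\r\n, \r or \n), or a final
# unterminated line.  No other characters count as line boundaries.
_LINE_RE = re.compile(r"[^\r\n]*(?:\r\n|\r|\n)|[^\r\n]+")


def split_csv_lines(text):
    """Hypothetical helper: split only on \\r\\n, \\r and \\n, keeping the ends."""
    return _LINE_RE.findall(text)


def read_rows(source):
    """Hypothetical helper: prefer source.read(), fall back to iter(source)."""
    read = getattr(source, "read", None)
    if read is not None:
        lines = split_csv_lines(read())
    else:
        lines = iter(source)
    return list(csv.reader(lines))


# A vertical tab inside a field no longer splits the row.
rows_from_file = read_rows(io.StringIO("a\x0bb,c\r\nd,e\n"))

# Objects without a read method (here a plain list) still work.
rows_from_list = read_rows(["x,y\n", "1,2\n"])

print(rows_from_file)
print(rows_from_list)
```

The fallback keeps existing callers that pass lists or generators of lines working, which is the backwards compatibility concern raised above.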

----------
nosy: +nascheme

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue30825>
_______________________________________

