csv.DictReader line skipping should be considered a bug?

Tue Dec 5 13:39:51 EST 2017

On 2017-12-05, Jason <jasonhihn at gmail.com> wrote:
> I ran into this:
> https://stackoverflow.com/questions/27707581/why-does-csv-dictreader-skip-empty-lines
>
> # unlike the basic reader, we prefer not to return blanks,
> # because we will typically wind up with a dict full of None
> # values
>
> while iterating over two files, which are line-by-line corresponding. The DictReader skipped ahead many lines, breaking the line-by-line correspondence.
>
> And I want to argue that the difference of behavior should be considered a bug. It should be considered as such because:
> 1. I need to know what's in the file to know what class to use. The file content should not break at-least-1-record-per-line. There may be multiple lines per record in the case of embedded newlines, but there should never be no record for a line.
> 2. It's a premature optimization. If skipping blank lines is desirable, then have another class on top of DictReader, maybe call it EmptyLineSkippingDictReader.
> 3. The intent of DictReader is to return a dict, nothing more, therefore the change of behavior is inappropriate.
>
> Does anyone agree, or am I crazy?

I've used csv.DictReader for years and never come across this
oddity. Very interesting!

I am with you. Silently discarding blank records hides
information--the current design is unusable if blank records are
of interest. Moreover, what's wrong with a dict full of None, if
that's what's in the record? How many Nones are too many?
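
For anyone who wants to see the difference, here is a minimal sketch
(Python 3). The sample data and the helper name
dict_rows_keeping_blanks are made up for illustration; the helper is
only one possible workaround, not anything the csv module provides.

```python
import csv
import io

# Hypothetical sample: a header, a record, a blank line, another record.
data = "a,b\n1,2\n\n3,4\n"

# csv.reader returns an empty list for the blank line.
rows = list(csv.reader(io.StringIO(data)))

# csv.DictReader silently skips the blank line.
dict_rows = list(csv.DictReader(io.StringIO(data)))


def dict_rows_keeping_blanks(f):
    """Yield one dict per row, including blank rows (all values None).

    Simplified on purpose: fields beyond the header are dropped,
    whereas DictReader would collect them under restkey.
    """
    reader = csv.reader(f)
    fieldnames = next(reader)  # first row is taken as the header
    for row in reader:
        # Pad short (or empty) rows with None so every key is present.
        padded = row + [None] * (len(fieldnames) - len(row))
        yield dict(zip(fieldnames, padded))


kept = list(dict_rows_keeping_blanks(io.StringIO(data)))

print(len(rows), len(dict_rows), len(kept))
```

With the helper, the blank line comes back as a dict whose values are
all None, so two files can still be walked in step.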

-- 
Neil Cerutti