[Python-ideas] csv.DictReader could handle headers more intelligently.
alex23
wuwei23 at gmail.com
Wed Jan 23 02:51:38 CET 2013
On Jan 23, 11:06 am, "J. Cliff Dyer" <j... at sdf.lonestar.org> wrote:
> I'm working with some poorly-formed CSV files, and I noticed that
> DictReader always and only pulls headers off of the first row. But many
> of the files I see have blank lines before the row of headers, sometimes
> with commas to the appropriate field count, sometimes without. The
> current implementation's behavior in this case is likely never correct,
> and certainly always annoying.
I don't think we should start adding support for every malformed type
of csv file that exists. It's easy enough to remove the unnecessary
lines yourself before passing them to DictReader:
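from csv import DictReader

# Python 3: open in text mode with newline='' (Python 2 used 'rb').
with open('malformed.csv', newline='') as csvfile:
    # Keep only lines that contain something other than whitespace.
    csvlines = [l for l in csvfile if l.strip()]
    csvreader = DictReader(csvlines)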
Personally, if I was dealing with this as often as you are, I'd
probably make a custom context manager instead. The problem lies in
the files themselves, not in csv's response to them.
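For what it's worth, here is a rough sketch of what such a context manager might look like. The name open_dict_reader is made up for illustration, it assumes Python 3 and a comma delimiter, and it also drops lines that hold nothing but commas, since the original report mentions those. Treat it as one possible approach rather than anything the csv module provides.

```python
import csv
from contextlib import contextmanager


@contextmanager
def open_dict_reader(path, **kwargs):
    """Hypothetical helper: yield a DictReader that never sees empty lines.

    Assumes a comma-delimited file; extra keyword arguments are passed
    through to csv.DictReader unchanged.
    """
    with open(path, newline='') as csvfile:
        # Skip lines that are blank or contain only commas and whitespace,
        # so the first line DictReader reads is the real header row.
        lines = (line for line in csvfile if line.strip(', \t\r\n'))
        yield csv.DictReader(lines, **kwargs)
```

Usage would be along the lines of "with open_dict_reader('malformed.csv') as reader:" followed by a loop over reader. Two caveats: a data row whose fields are all empty is dropped too, and filtering raw lines can remove blank lines inside a quoted multi-line field, so this only suits files where neither case matters.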