[Python-ideas] csv.DictReader could handle headers more intelligently.

Steven D'Aprano steve at pearwood.info
Thu Jan 24 01:19:34 CET 2013


On 24/01/13 05:20, Bruce Leban wrote:

> I realize that sometimes getting a single value and sometimes an array is
> potentially messy, but bear in mind that in most cases the reader of the
> csv file has some idea of what they are reading. There could be an optional
> parameter multivalue="A" that lists the columns that are allowed to have
> multiple values and if not present it raises an exception. To allow any
> column to be multivalued, you could use multivalue=True.

-1 to adding optional parameters that change the behaviour of a class.

To deal with cases where you expect multiple columns with the same name,
add a new reader class that treats all columns as multivalued. The
standard DictReader class should continue to behave like a dict.

Don't over-engineer this MultiDictReader -- it should stay simple and treat
all column names as potentially multivalued. If the caller has some
requirements for which names can have how many columns -- "there should be
exactly three columns named X, and only one Y, and at least four Z" -- they
can check the result and decide for themselves if there is a problem.
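
A minimal sketch of what such a reader might look like. The name
MultiDictReader and its interface are hypothetical (nothing like it exists
in the csv module); the sketch only assumes that the first row is the header
and that every column name maps to a list of values, in column order.

```python
import csv
import io
from collections import defaultdict


class MultiDictReader:
    """Hypothetical reader: every column name maps to a list of values."""

    def __init__(self, f, **kwargs):
        self._reader = csv.reader(f, **kwargs)
        # Take the first row as the header, keeping duplicate names.
        # An empty input yields an empty header and no rows.
        self.fieldnames = next(self._reader, [])

    def __iter__(self):
        return self

    def __next__(self):
        row = next(self._reader)
        result = defaultdict(list)
        # Values for a repeated name are collected in column order.
        for name, value in zip(self.fieldnames, row):
            result[name].append(value)
        return dict(result)


data = io.StringIO("X,Y,X\n1,2,3\n4,5,6\n")
rows = list(MultiDictReader(data))

# The caller decides what counts as a problem, for example
# "exactly two columns named X and only one Y".
for row in rows:
    if len(row["X"]) != 2 or len(row["Y"]) != 1:
        raise ValueError("unexpected column counts: %r" % row)
```

The reader itself enforces nothing about how many times a name may appear;
any such requirement lives in the caller's own check, as in the loop above.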


> As to skipping over a leading blank line, this happened to me just
> yesterday. I was saving some data in csv files and all the files ended up
> with an extra blank line at the top. I'd be +1 for skipping over a blank
> line at the top, +0 for skipping over more than one blank line.


I don't see any reason not to skip blank lines at the top of the file.
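
Until something like that is built in, a caller can get the same effect by
filtering the input before DictReader sees it. This is only a sketch: the
helper name skip_leading_blank_lines is made up for illustration, and it
assumes a line containing nothing but whitespace counts as blank.

```python
import csv
import io
import itertools


def skip_leading_blank_lines(lines):
    """Drop blank lines at the start of an iterable of lines only."""
    # dropwhile stops filtering at the first non-blank line, so blank
    # lines later in the file are passed through untouched.
    return itertools.dropwhile(lambda line: not line.strip(), lines)


data = io.StringIO("\n\nname,age\nAlice,30\n")
reader = csv.DictReader(skip_leading_blank_lines(data))
rows = list(reader)
fieldnames = reader.fieldnames
```

One side effect of filtering this way is that the reader's line_num no
longer counts the skipped lines, which may matter for error messages.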



-- 
Steven


