Is anyone happy with csv module?

John Machin sjmachin at lexicon.net
Wed Dec 12 15:34:05 EST 2007


On Dec 13, 12:58 am, Neil Cerutti <horp... at yahoo.com> wrote:
> On 2007-12-12, John Machin <sjmac... at lexicon.net> wrote:
>
> >> It's clear that I am thinking to completely different usages
> >> for CSV than what most people in this thread. I use csv to
> >> export and import numerical data columns to and from
> >> spreadsheets.
>
> > For that purpose, CSV files are the utter pox and then some.
> > Consider using xlrd and xlwt (nee pyexcelerator) to read (resp.
> > write) XLS files directly.
>
> I can vouch for that advice. I was exporting .xls files to csv
> text files for over a year before I tried the xlrd solution--the
> whole process is less cumbersome now, though it was bewildering
> at first working with Excel in Python. Actually, surprises still
> crop up now and then, mostly to do with cell types.

Hi Neil, I'd be interested in hearing from you what caused the initial
bewilderment with xlrd, and could it have been reduced by better
documentation? What kinds of surprises?

> The advantage
> of working with csv was that everything was a string.

It depends of your point of view. I'd regard that as a
DISadvantage :-) With xlrd, if you have no expectation about the type
of data in a cell, but need/want to know, xlrd will tell you. If you
do have an expectation, you can check if actual == expected.

Here's an example. Create a tiny csv file with dates in a format
that's NOT appropriate to your locale (e.g. if you are in the USA,
like the ddmmyyyy_dates.csv below). Open it with Excel by double-
clicking on the name in Windows Explorer. Make the column twice its
initial width. You'll notice some of the data (about 60% in a large
dataset with approx. uniform distribution e.g. birth-dates) is LEFT-
justified (text) and the remainder is RIGHT-justified (date). If you
were given the xls file in that state, using xlrd the problem could be
detected and worked around. Alternatively, go into user emulation
mode: "fix" the problem with some formulas, forget immediately what
you did, save the result as a new csv file, pass that on to the next
user without a murmur, and delete the original csv file and the xls
file.

This is based on a true story, with one difference: the dates in the
original csv file were formatted correctly for the DD/MM/YYYY-using
locale; however Excel 97 would ignore the locale and assume MM/DD/YYYY
if you opened the file from within Excel instead of double-clicking in
Explorer (or vice versa; I forget which).

8<== ddmmyyyy_dates.csv
01/01/2007
31/01/2007
01/12/2007
31/12/2007
8<== mmddyyyy_dates.csv
01/01/2007
01/31/2007
12/01/2007
12/31/2007
8<==

Cheers,
John




More information about the Python-list mailing list