PEP 305 - CSV File API

Carlos Ribeiro cribeiro at mail.inet.com.br
Mon Feb 3 20:46:27 EST 2003


Skip and John,

skip> I will note that the csv module under development makes *no* 
skip> attempts at any kind of data conversion when reading CSV 
skip> files.  Even ints and floats are returned as strings.  

john> So the functionality for handling different data formats for 
john> dates (and any other data-types) -- in fact any functionality 
john> that knows or even suspects what the data type might be 
john> -- should be factored out into another layer.

Implemented this way, the CSV library is not nearly as useful as it could be.
In the end, whatever library sits in the 'upper layer' will end up being
regarded as the 'real' CSV library. Let me point out a very simple situation
that I face literally every day in my 'real life'.

I run a small food processing company, and I do some data analysis daily using 
Excel. Most of the data that I analyze is read from a database (using ADO) 
and pre-processed using a bunch of Python scripts; these scripts just export 
CSV files that are read into Excel later [1][2]. 

The problem is, almost all my intermediate files have both 'date' and 'float'
columns. This is highly common in business, especially if you are looking at
sales figures and the like.

To compound my problem, Python writes floats with a period (.) as the decimal
separator. However, my copy of Excel is configured for the Brazilian locale,
and it expects a comma (,) as the decimal separator.

Now for the real issue. If I convert my floats to strings *before* writing the
CSV file, they will end up quoted (for example, '3,1416') - assuming that the
CSV library works as Skip said. This is not what I would expect, and in
fact, it's not what anyone working with different locale settings would expect.
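
To make the quoting issue concrete, here is a minimal sketch. It assumes the
csv module as it later shipped in the Python standard library (written here
for Python 3) and its default settings, where a field that contains the
delimiter is quoted. The helper name write_row is made up for illustration.

```python
import csv
import io

def write_row(row, **fmtparams):
    """Hypothetical helper: write one row and return the raw CSV text."""
    buf = io.StringIO()
    csv.writer(buf, **fmtparams).writerow(row)
    return buf.getvalue()

# The float was pre-formatted with a decimal comma, so the field now
# contains the default delimiter and the writer wraps it in quotes.
line = write_row(["2003-02-03", "3,1416"])
print(repr(line))
```

The second field comes out wrapped in double quotes, which is the behavior
described above.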

Finally, even if Python just wrote floats with the 'right' decimal separator -
comma, in my case - there would still be other software packages that
expect periods. Or worse, I could try to send my data files to people
in other countries who would be unable to read them. In any event, there is no
automatic solution, but the ability to quickly adjust the CSV library to get
the correct behavior would be highly useful.
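
One possible shape for that kind of adjustment is sketched below. It is only
an illustration under stated assumptions: it uses the csv module as it later
shipped in the standard library (Python 3), the helpers format_cell and
dump_rows are hypothetical, and the choice of a semicolon as the field
delimiter for a comma-decimal locale is an assumption about what the
receiving program expects.

```python
import csv
import io

def format_cell(value, decimal_sep="."):
    """Hypothetical helper: render floats with the chosen decimal separator."""
    if isinstance(value, float):
        return repr(value).replace(".", decimal_sep)
    return str(value)

def dump_rows(rows, delimiter=",", decimal_sep="."):
    """Hypothetical helper: write rows to CSV text with adjustable separators."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, lineterminator="\n")
    for row in rows:
        writer.writerow([format_cell(v, decimal_sep) for v in row])
    return buf.getvalue()

rows = [["2003-02-03", 3.1416], ["2003-02-04", 2.5]]

# Assumed settings for a comma-decimal locale: ';' between fields.
brazil = dump_rows(rows, delimiter=";", decimal_sep=",")

# Default settings for consumers that expect periods.
default = dump_rows(rows)

print(brazil)
print(default)
```

Because the field delimiter no longer collides with the decimal comma, the
floats are written unquoted in both variants, and switching between them is a
matter of two parameters.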


Carlos Ribeiro

---
[1] I know I could control Excel with COM or even ADO, but writing CSV files 
is simple; also, the intermediate files are useful for both debugging and 
backup purposes.

[2] Better still, some people may ask me why I'm using Excel, and not doing 
everything in Pure Python. <sigh>. No comments.




