[Csv] Access Products sample

Cliff Wells LogiplexSoftware at earthlink.net
Thu Jan 30 21:10:10 CET 2003


On Wed, 2003-01-29 at 22:33, Kevin Altis wrote:
> I created a db and table in Access (products.mdb) using one of the built-in
> samples. I created two rows, one that is mostly empty. I used the default
> CSV export to create (Products.csv) and also output the table as an Excel
> 97/2000 XLS file (Products.xls). Finally, I had Excel export as CSV
> (ProductsExcel.csv). They are all contained in the attached zip.
> 
> The currency column in the table is actually written out with formatting
> ($5.66 instead of just 5.66). Note that when Excel exports this column it
> has a trailing space for some reason (,$5.66 ,).

So we've actually found an application that puts an extraneous space
around the data, and it's our primary target.  Figures.
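To make the trailing-space case concrete, here is a minimal sketch. It assumes the reader interface of Python's standard csv module, and the sample row and column names are made up, modeled only on the ",$5.66 ," fragment quoted above. The parser hands the field back untouched, and any stripping is a separate step the caller chooses to apply.

```python
import csv
import io

# Made-up sample modeled on the ",$5.66 ," fragment quoted above;
# the column names are hypothetical.
sample = "ProductID,UnitPrice,ProductName\r\n1,$5.66 ,Widget\r\n"

# The parser returns the field exactly as written, trailing space included.
rows = list(csv.reader(io.StringIO(sample, newline="")))

# Stripping is done by the caller after parsing, not by the parser.
cleaned = [[field.strip() for field in row] for row in rows]

print(repr(rows[1][1]))
print(repr(cleaned[1][1]))
```

Whether stripping belongs in the parser at all is a separate question; this only shows that it can be layered on top.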

> While exporting it reminded me that unless a column in the data set contains
> an embedded newline or carriage return it shouldn't matter whether the file
> is opened in binary mode for reading.
> 
> Without a schema we don't know what each column is supposed to contain, so
> that is outside the domain of the csv import parser and export writer.

Agreed.
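For the embedded-newline case, a small sketch of why the file mode matters, again assuming the interface of Python's standard csv module. An in-memory buffer with newline translation turned off stands in for a file opened in binary mode, and the field values are invented for illustration.

```python
import csv
import io

# newline="" disables newline translation, standing in for binary mode.
buf = io.StringIO(newline="")

# The first field contains an embedded newline, so the writer quotes it.
csv.writer(buf).writerow(["line one\nline two", "plain"])
raw = buf.getvalue()

# Reading back from the untranslated buffer recovers the field intact.
buf.seek(0)
rows = list(csv.reader(buf))

print(repr(raw))
print(rows)
```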

> The values exported by both Access and Excel are designed to prevent
> information loss within the constraints of the CSV format, thus a field with
> no value (what I think of as None in Python) is empty in the CSV

Something just occurred to me:  say someone is controlling Excel via
win32com and obtains their data that way.  Do the empty cells in that
list appear as '' or None?  If they do appear as None, then I'd be
inclined to again raise the argument that we should map None => '' on
export.  Unless, of course, someone else has an idea they want to trade
+1 votes on again <wink>
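A sketch of what None => '' on export could look like, assuming the writer interface of Python's standard csv module. The helper name blank_none and the row data are hypothetical. The row is hard-coded to stand in for whatever win32com would return, since I haven't checked what it actually hands back for an empty cell.

```python
import csv
import io


def blank_none(row):
    """Hypothetical helper: replace None with '' before the row is written."""
    return ["" if value is None else value for value in row]


# Stand-in for a row obtained from Excel, with one empty cell as None.
row = [1, "Chai", None, 5.66]

buf = io.StringIO(newline="")
csv.writer(buf).writerow(blank_none(row))
line = buf.getvalue()

print(repr(line))
```

The mapping is done explicitly here so the behavior does not depend on what the writer itself decides to do with None.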

> Should we be able to import and then export using a given dialect, such
> that there would be no differences between the original csv and the exported
> one? Actually, using the Access default of quoting strings it isn't possible
> to do that because it implies having a schema to know that a given column is
> a string. With the Excel csv format it is possible because a column that
> doesn't contain a comma won't be quoted.

I don't think that we need to worry about whether checksum(original) ==
checksum(output) to claim compatibility, only that we can read and write
files compatible with said application.  If they turn out to be
identical, that's just a side-effect ;)
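To illustrate the distinction, a sketch assuming the reader/writer interface of Python's standard csv module and its "excel" dialect. The input is invented, using Access-style quoting of strings, and parse is a hypothetical helper. The rewritten text differs from the original byte for byte, but both parse to the same rows, which is the kind of compatibility I mean.

```python
import csv
import io


def parse(text, dialect="excel"):
    """Hypothetical helper: parse CSV text into a list of rows."""
    return list(csv.reader(io.StringIO(text, newline=""), dialect=dialect))


# Invented input with every string field quoted, Access-style.
original = '"1","Chai",5.66\r\n"2","",\r\n'

rows = parse(original)

# Write the same rows back out; only fields that need quoting get quoted.
buf = io.StringIO(newline="")
csv.writer(buf, dialect="excel").writerows(rows)
rewritten = buf.getvalue()

print(rewritten == original)
print(parse(rewritten) == rows)
```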

-- 
Cliff Wells, Software Engineer
Logiplex Corporation (www.logiplex.net)
(503) 978-6726 x308  (800) 735-0555 x308


