[Csv] Re: First Cut at CSV PEP

Kevin Altis altis at semi-retired.com
Thu Jan 30 04:01:19 CET 2003


> From: Skip Montanaro
>
>     Dave> The Python core already has a convenience function for doing the
>     Dave> necessary conversion; PyObject_Str().
>
> This smacks of implicit type conversions to me, which has been the bane of
> my interaction with Perl (via XML-RPC).  I still think we have no business
> writing anything but strings, Unicode strings (encoded by codecs.open()),
> ints and floats to CSV files.  Exceptions should be raised for anything
> else, even None.  An empty field is "".
>
>     Dave> If we are in a hurry we could document the existing low level
>     Dave> writer behaviour which is to invoke PyObject_Str() for all
>     Dave> non-string values except None.  None is translated to ''.
>
> I really still dislike this whole None thing.  Whose use case is that
> anyway?

I think I brought up None. There was some initial confusion because Cliff's
DSV exporter was doing the wrong thing. My feeling is that if you have a
list

[5, 'Bob', None, 1.1]

as a csv with the Excel dialect that becomes

5,Bob,,1.1

Are you saying that you want to throw an exception instead? Booleans may
also present a problem. I was mostly thinking in terms of importing and
exporting data from embedded databases like MetaKit, my own list of
dictionaries (flatfile stuff), PySQLite, Gadfly. Anyway, the implication
might be that it is necessary for the user to sanitize data as part of the
export operation too. Have to ponder that.
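
To make the two positions concrete, here is a minimal sketch. It assumes the csv module interface as found in the Python standard library (csv.writer and the "excel" dialect); strict_row is a hypothetical helper invented for illustration, not part of any library.

```python
import csv
import io


def write_row(row):
    """Write one row with the Excel dialect and return the resulting text."""
    buf = io.StringIO(newline="")
    csv.writer(buf, dialect="excel").writerow(row)
    return buf.getvalue()


def strict_row(row):
    """Hypothetical sanitizer: allow only str, int and float, reject the rest."""
    for value in row:
        # bool is a subclass of int, so it has to be excluded explicitly
        if isinstance(value, bool) or not isinstance(value, (str, int, float)):
            raise TypeError("cannot write %r to a CSV field" % (value,))
    return row


# Lenient behaviour: None is written as an empty field.
lenient = write_row([5, "Bob", None, 1.1])

# Strict behaviour: the caller sanitizes first, and None is rejected.
try:
    write_row(strict_row([5, "Bob", None, 1.1]))
    strict_error = None
except TypeError as exc:
    strict_error = str(exc)
```

With the lenient path the row comes out as 5,Bob,,1.1 followed by the dialect's line terminator; with the strict path the export stops at the None, and the user has to decide what an empty value means before writing.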

Regardless, we have to be careful to not make this too complicated or it
will be worse than nothing.

Quotes aren't going to get used in the case above unless you've specified to
always use them (overriding part of the Excel dialect), because no field
contains the comma separator character. Now that I look at this again, the
Access export dialog I sent in an earlier email shows that the default
Access csv is actually a separate dialect: it specifically calls out the
"Text qualifier", so numbers and empty fields (probably NULLs in SQL?)
will not have quotes; only text fields will.
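
A rough sketch of how those quoting choices differ, assuming the quoting constants of the standard library csv module (QUOTE_MINIMAL, QUOTE_NONNUMERIC, QUOTE_ALL). Treating QUOTE_NONNUMERIC as a stand-in for the Access "Text qualifier" behaviour is my assumption, not something the Access dialog confirms; render is a hypothetical helper.

```python
import csv
import io


def render(row, quoting):
    """Hypothetical helper: write one row with the given quoting rule."""
    buf = io.StringIO(newline="")
    csv.writer(buf, dialect="excel", quoting=quoting).writerow(row)
    return buf.getvalue()


row = [5, "Bob", 1.1]

# Quote only when a field needs it (the Excel default): no quotes here.
minimal = render(row, csv.QUOTE_MINIMAL)

# Quote text fields, leave numbers bare (closest to the Access export).
text_only = render(row, csv.QUOTE_NONNUMERIC)

# Quote every field regardless of type.
everything = render(row, csv.QUOTE_ALL)

# A field containing the separator forces quotes even under QUOTE_MINIMAL.
needs_quotes = render([5, "Altis, Kevin", 1.1], csv.QUOTE_MINIMAL)
```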

To further complicate things, I'm now wondering what happens with numbers in
Europe or elsewhere where the comma is used instead of a decimal point, so
1.1 is 1,1. Or does that not actually occur and I'm remembering some
localization issues incorrectly?
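
Decimal commas do occur in a number of locales, and exports from those locales often switch the field separator to a semicolon. The sketch below only illustrates the interaction with quoting; to_decimal_comma and render are hypothetical helpers, and the csv.writer delimiter parameter is the standard library's.

```python
import csv
import io


def to_decimal_comma(value):
    """Hypothetical helper: render floats with a decimal comma, leave the rest."""
    if isinstance(value, float):
        return repr(value).replace(".", ",")
    return value


def render(row, delimiter):
    """Write one row using the given field separator and return the text."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf, delimiter=delimiter)
    writer.writerow([to_decimal_comma(v) for v in row])
    return buf.getvalue()


# With a comma separator the number has to be quoted to stay one field.
comma_sep = render([5, "Bob", 1.1], ",")

# With a semicolon separator no quoting is needed.
semicolon_sep = render([5, "Bob", 1.1], ";")
```

Either way the float has already become a string by the time the writer sees it, so the conversion is the user's job rather than the dialect's.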

Reading in

5,Bob,,1.1

becomes

['5', 'Bob', '', '1.1']

because we said we weren't going to do further processing; the user code
should do any further conversions as part of the iteration.
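
A minimal sketch of that division of labour, assuming the standard library csv.reader; convert is a hypothetical user-side function, and its rules (empty string to None, then int, then float) are just one possible choice.

```python
import csv
import io


def convert(field):
    """Hypothetical per-field conversion done by user code, not by the reader."""
    if field == "":
        return None
    for cast in (int, float):
        try:
            return cast(field)
        except ValueError:
            pass
    return field


source = io.StringIO("5,Bob,,1.1\r\n", newline="")

raw_rows = []
typed_rows = []
for fields in csv.reader(source, dialect="excel"):
    raw_rows.append(fields)                         # the reader yields strings only
    typed_rows.append([convert(f) for f in fields]) # conversion happens per row
```

The reader hands back ['5', 'Bob', '', '1.1'] and the user code turns it into [5, 'Bob', None, 1.1] while iterating.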

I'm way behind on reading all the emails. I got bogged down in a bunch of
Mac OS X testing... I'll try and dig through them a little tomorrow and
Friday.

If we put together the unittest test cases first then our input, output, and
expected results for processing would be clear for a given dialect.
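
Something along these lines, as a sketch only: the cases are illustrative rather than an agreed test suite, and it assumes the standard library csv and unittest modules.

```python
import csv
import io
import unittest

# (dialect name, row to write, expected text); illustrative cases only
CASES = [
    ("excel", [5, "Bob", None, 1.1], "5,Bob,,1.1\r\n"),
    ("excel", ["a,b", "c"], '"a,b",c\r\n'),
    ("excel", ['say "hi"'], '"say ""hi"""\r\n'),
]


class DialectRoundTripTest(unittest.TestCase):
    def test_write_then_read(self):
        for dialect, row, expected in CASES:
            with self.subTest(dialect=dialect, row=row):
                # Output: the written text must match exactly.
                buf = io.StringIO(newline="")
                csv.writer(buf, dialect=dialect).writerow(row)
                self.assertEqual(buf.getvalue(), expected)

                # Input: reading it back yields strings, None becomes ''.
                buf.seek(0)
                fields = next(csv.reader(buf, dialect=dialect))
                as_text = ["" if v is None else str(v) for v in row]
                self.assertEqual(fields, as_text)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(DialectRoundTripTest)
result = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(suite)
```

Adding a dialect then means adding rows to CASES, which keeps the expected behaviour in one readable table.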

ka



