First Cut at CSV PEP
Cliff Wells
LogiplexSoftware at earthlink.net
Tue Jan 28 22:17:32 CET 2003
On Mon, 2003-01-27 at 20:56, Dave Cole wrote:
> I only have one issue with the PEP as it stands. It is still aiming
> too low. One of the things that we support in our parser is the
> ability to handle CSV without quote characters.
>
> field1,field2,field3\, field3,field4
>
> One of our customers has data like the above. To handle this we would
> need something like the following:
>
> # Use the 'raw' dialect to get access to all tweakables.
> writer(fileobj,
>        dialect='raw', quotechar=None, delimiter=',', escapechar='\\')
+1 on escapechar, -1 on 'raw' dialect.
Why would a 'raw' dialect be needed? It isn't clear to me why
escapechar would be mutually exclusive with any particular dialect.
Further, not specifying a dialect (dialect=None) should be the default,
which would seem to be the same as 'raw'.
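For illustration, here is a minimal sketch of escapechar working without any quote character and without a special dialect. It uses the csv module as it later shipped in the Python standard library, which may differ from the interface being discussed in this thread; the variable names are only for the example.

```python
import csv
import io

# The row from Dave's example: the third field contains a delimiter.
row = ["field1", "field2", "field3, field3", "field4"]

# Write with quoting disabled; the embedded comma is escaped instead.
buf = io.StringIO()
writer = csv.writer(buf, delimiter=",", quoting=csv.QUOTE_NONE, escapechar="\\")
writer.writerow(row)
line = buf.getvalue()

# Read it back with the same settings to recover the original fields.
reader = csv.reader(io.StringIO(line), delimiter=",",
                    quoting=csv.QUOTE_NONE, escapechar="\\")
parsed = next(reader)
```

Note that no 'raw' dialect is involved here: the escape character is just one more option layered on top of the defaults.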
> I think that we need some way to handle a potentially different set of
> options on each dialect.
I don't understand how this is different from Skip's suggestion to use:
    reader(fileobj, dialect="excel2000", delimiter='\t')
Or are you suggesting that not all options would be available on all
dialects? Can you suggest an example?
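As a rough sketch of the "dialect plus override" call style, this is how it looks in the csv module that later shipped with Python. The "excel" dialect name is the one registered there; "excel2000" above is not, so it is swapped out here, and the sample data is made up for the example.

```python
import csv
import io

# Tab-separated sample data (invented for this example).
data = "a\tb\tc\r\n1\t2\t3\r\n"

# Start from the "excel" dialect, then override just the delimiter.
rows = list(csv.reader(io.StringIO(data), dialect="excel", delimiter="\t"))
```

Every other setting (quote character, line terminator, and so on) still comes from the named dialect.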
> When you CSV export from Excel, do you have the ability to use a
> delimiter other than comma? Do you have the ability to change the
> quotechar?
IIRC, Excel has an option to save as a TSV file, which is the same as a
CSV file but with tabs as the delimiter.
> Should the wrapper protect you from yourself so that when you select
> the Excel dialect you are limited to the options available within
> Excel?
No. I think this would be unnecessarily limiting.
> Maybe the dialect should not limit you, it should just provide the
> correct defaults.
This is what I'm thinking.
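One way to picture "the dialect just provides defaults" is a table of per-dialect defaults that keyword arguments are free to override. The sketch below is purely hypothetical: BASE_DEFAULTS, DIALECT_DEFAULTS and resolve_options are names invented for this example, not part of the PEP or of any module.

```python
# Hypothetical baseline used when no dialect is named (dialect=None).
BASE_DEFAULTS = {"delimiter": ",", "quotechar": '"', "escapechar": None}

# Hypothetical per-dialect tables: each lists only what differs from the base.
DIALECT_DEFAULTS = {
    "excel": {},
    "excel-tab": {"delimiter": "\t"},
}


def resolve_options(dialect=None, **overrides):
    """Merge base defaults, dialect defaults, then caller overrides."""
    options = dict(BASE_DEFAULTS)
    if dialect is not None:
        # An unknown dialect name raises KeyError.
        options.update(DIALECT_DEFAULTS[dialect])
    unknown = set(overrides) - set(BASE_DEFAULTS)
    if unknown:
        raise TypeError("unknown option(s): %s" % ", ".join(sorted(unknown)))
    # Overrides win, so a dialect never limits what the caller can set.
    options.update(overrides)
    return options
```

With this arrangement, resolve_options("excel", quotechar=None, escapechar="\\") is allowed even though Excel itself offers no such setting.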
> Since we are going to have one parsing engine in an extension module
> below the Python layer, we are probably going to evolve more tweakable
> settings in the parser over time. It would be nice if we could hide
> new tweakables from application code by associating default values
> with dialect names in the Python layer. We should not be exposing the
> low level parser interface to user code if it can be avoided.
+1
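To make the idea concrete, here is a small sketch using the csv module that later shipped with Python, where csv.register_dialect ties a bundle of settings to a name. The dialect name "escaped" is made up for this example, and this is not necessarily the mechanism the PEP authors had in mind.

```python
import csv
import io

# Bundle the low-level settings under a name ("escaped" is hypothetical).
csv.register_dialect("escaped", delimiter=",",
                     quoting=csv.QUOTE_NONE, escapechar="\\")

# Application code only needs the dialect name, not the individual settings.
source = io.StringIO("field1,field2,field3\\, field3,field4\r\n")
rows = list(csv.reader(source, dialect="escaped"))
```

If the parser grows new settings later, only the registration has to change; callers that pass the name are untouched.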
--
Cliff Wells, Software Engineer
Logiplex Corporation (www.logiplex.net)
(503) 978-6726 x308 (800) 735-0555 x308