* separated values

Cliff Wells logiplexsoftware at earthlink.net
Tue Jan 15 16:32:45 EST 2002


On Tue, 15 Jan 2002 21:16:29 +0000 (UTC)
Magnus Lie Hetland wrote:

[snip]

> I'm not sure which version one should use, though... I *do* believe it
> is possible to formulate a definition of CSV which satisfies all
> available versions, and which can be called "standard". The question
> is which ones of the available modules support it, in addition to
> performance, of course. Dave Cole's version seems to have high
> performance, but I'm a bit worried that it doesn't seem to allow line
> breaks inside quoted fields (which I believe some programs may
> produce). Also, its insistence on having a field separator directly
> after a closing quote may be correct, but perhaps allowing whitespace
> before and after the separator had been more flexible?

[snip]

Support for line breaks inside quoted fields is an absolute requirement.  I
would guess that a huge number of CSV files are generated by MS Excel or
Access, and they will put newlines inside quotes.  I doubt that allowing
spaces between the quote and the separator is a good idea, because then it
becomes ambiguous whether the space should be included as part of the data
or ignored.  At the very least, it forces quoted and unquoted data to be
handled differently (spaces allowed around quoted data, not allowed around
unquoted data).
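
To make the two points above concrete, here is a minimal sketch.  It assumes
the csv module from the current Python standard library, which is not one of
the modules discussed in this thread; the helper name parse and the sample
strings are made up for illustration.  It shows a line break surviving inside
a quoted field, and how a space between the separator and an opening quote
changes the result depending on whether that space is treated as data.

```python
import csv
import io


def parse(text, **options):
    """Parse CSV text into a list of rows with the standard csv module."""
    # newline='' keeps line endings untranslated so the reader sees them as-is.
    return list(csv.reader(io.StringIO(text, newline=''), **options))


# Hypothetical sample: the second field of the data row holds a line break.
multiline = parse('name,comment\r\n"Smith","line one\nline two"\r\n')

# By default a space after the separator is kept as data, and the quotes
# that follow it are no longer treated as quoting.
default_spaces = parse('a, "b"\r\n')

# With skipinitialspace=True the space after the separator is ignored,
# so the same input now yields a properly quoted field.
skipped_spaces = parse('a, "b"\r\n', skipinitialspace=True)

print(multiline)
print(default_spaces)
print(skipped_spaces)
```

The same bytes produce two different values for the second field, which is
the ambiguity in question: a parser has to pick one interpretation, and the
choice only arises once spaces next to quotes are tolerated at all.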

BTW, I pretty much agree with everything else I snipped.

-- 
Cliff Wells
Software Engineer
Logiplex Corporation (www.logiplex.net)
(503) 978-6726 x308
(800) 735-0555 x308



