* separated values

Magnus Lie Hetland mlh at vier.idi.ntnu.no
Tue Jan 15 17:22:56 EST 2002


In article <...>, Cliff Wells wrote:
[snip]
>The line breaks inside quoted fields is an absolute requirement.  I would
>guess that a huge number of CSV files are generated by MS Excel or Access
>and they will put newlines inside quotes.  I doubt that allowing spaces
>between the quote and the separator is a good idea because then it becomes
>somewhat ambiguous whether the space should be included as part of the data
>or if it should be ignored.  At the very least, it causes a differentiation
>between how to handle quoted data versus unquoted data (spaces allowed
>around quoted data, not allowed around unquoted data).

Absolutely. What I reacted to was the following statement (from the
web page):

  The parser will raise a csv.Error exception under any of the
  following circumstances: 

    * If the closing " on a quoted field is not immediately followed
      by either end of line, or a field separator. 

    * If an end of line is encountered which is not at the end
      of string 

It is obvious that we agree that the second is unreasonable, and it
seems you may agree that the first is unreasonable too? (I'm only
talking about allowing space outside quoted fields here.)
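
To make the two rules concrete, here is a minimal sketch. It uses the csv module from present-day Python's standard library, not the package whose web page is quoted above, so it shows only how one existing parser treats these cases. The helper name parse is made up for the illustration.

```python
import csv
import io

def parse(text, **options):
    # Hypothetical helper: parse a CSV string with the stdlib reader
    # and return all rows as a list of lists.
    return list(csv.reader(io.StringIO(text), **options))

# A line break inside a quoted field stays part of that field.
multiline = parse('a,"line one\nline two",c\n')

# Space between the closing quote and the separator: by default the
# stray space is appended to the field instead of being rejected.
lenient = parse('"a" ,b\n')

# With strict=True the same input raises csv.Error.
try:
    parse('"a" ,b\n', strict=True)
    strict_rejected = False
except csv.Error:
    strict_rejected = True
```

In this sketch the lenient mode neither ignores the space nor raises an error; it keeps the space as data, which is exactly the ambiguity Cliff describes.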

OTOH: Should one simply disallow all space surrounding an unquoted
value? I guess any reasonable package generating a value surrounded by
space (which should be preserved) would quote that field (including
the space)...
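
For comparison, a small sketch of how the standard-library csv module in present-day Python (again, not the package under discussion) handles space around values. The helper name rows is hypothetical.

```python
import csv
import io

def rows(text, **options):
    # Hypothetical helper: parse a CSV string and return all rows.
    return list(csv.reader(io.StringIO(text), **options))

# Space around an unquoted value is preserved as data by default.
unquoted = rows(' a ,b\n')

# A generator that wants the space kept can also quote the field.
quoted = rows('" a ",b\n')

# Space before an opening quote: by default the quote is then no longer
# at the start of the field, so it is treated as ordinary data.
default = rows('a, "b c"\n')

# skipinitialspace=True drops space that directly follows the separator,
# so the quoted field is recognised again.
skipped = rows('a, "b c"\n', skipinitialspace=True)
```

This is one possible set of choices, controlled by a flag, and not a statement of what the standard ought to be.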

Oh, well. Whatever the standard, it should be clearly spelled out in
the docs, and preferably allow a lot of switches/flags to be supplied
so one can get the behaviour one needs...

>BTW, I pretty much agree with everything else I snipped.

--
Magnus Lie Hetland                                  The Anygui Project
http://hetland.org                                  http://anygui.org


