PEP 305 - CSV File API

Andrew Dalke adalke at mindspring.com
Fri Jan 31 19:17:48 EST 2003


Skip Montanaro wrote:
> A new PEP (305), "CSV File API", is available for reader feedback.  This PEP
> describes an API and implementation for reading and writing CSV files.

  - It's titled "Comma Separated Values" but handles separators other
than comma.  I liked DSV for the more generic name.  I suppose the "C"
could also stand for "Character".  In any case, mention early that
other delimiters are also allowed.

The ones I use most often are, in order, '\t', ' ', whitespace (\s+),
','.  Yep, comma is pretty far down my list.
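
A minimal sketch of those cases, assuming the csv module as it later
shipped in the standard library (the 'delimiter' keyword is part of that
module; the sample data and the helper names read_rows and
split_whitespace are invented for illustration).  A run of whitespace
is not a single-character delimiter, so that case falls back to
str.split():

```python
import csv
import io

def read_rows(text, delimiter):
    """Parse text into a list of rows using a one-character delimiter."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

def split_whitespace(text):
    """Split each non-empty line on runs of whitespace."""
    return [line.split() for line in text.splitlines() if line.strip()]

tab_rows = read_rows("pH\ttemp\n7.0\t25\n", "\t")      # tab separated
space_rows = read_rows("pH temp\n7.0 25\n", " ")       # single space
ws_rows = split_whitespace("pH   temp\n7.0 \t 25\n")   # mixed whitespace
```

All three calls produce the same two rows, with every field left as a
string.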


   - There isn't anything which describes how to interface with the
'row' returned from iterating over a reader, or what should be sent
to the writer.  At least not that I could find.  I assume it's a list.
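
For reference, a small sketch of the row interface, assuming the csv
module as it later shipped rather than the draft under review: there
the reader yields plain lists of strings, and the writer method is
named writerow and accepts a sequence.  The sample data is made up.

```python
import csv
import io

source = io.StringIO("name,pH\nwater,7.0\n")
rows = list(csv.reader(source))      # each row is a plain list of strings

sink = io.StringIO()
writer = csv.writer(sink, lineterminator="\n")
for row in rows:
    writer.writerow(row)             # a list goes straight back out
writer.writerow(("vinegar", 2.4))    # tuples work; non-strings are stringified
written = sink.getvalue()
```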



   - I've had success with an API like this

for row in TabFileWithHeader(open("input.txt")):
   print row["pH"]

That is, the first line is a header description, and the row object
returned is both indexable as a list and accessible by name, as
shown.

I think the same can be built as a layer on top of this PEP,

for row in HeaderReader(csv.reader(file("input.txt"))):
    print row["pH"]


  ... Ahh, that's your comment "What about an option to
generate list-of-dict output".  Another solution is my MultiKeyDict
package, which keeps the values in sorted order.
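
A rough sketch of how such a layer might look.  HeaderReader is only
the name used above, not an existing class, and Row is a hypothetical
helper; the sketch assumes the first row holds unique column names and
that the input is not empty.  The csv module as later released also
provides csv.DictReader for the list-of-dict style.

```python
import csv
import io

class Row(list):
    """A list of fields that can also be looked up by column name."""
    def __init__(self, fields, index_by_name):
        super().__init__(fields)
        self._index_by_name = index_by_name

    def __getitem__(self, key):
        # Translate a column name into its position, then index normally.
        if isinstance(key, str):
            key = self._index_by_name[key]
        return super().__getitem__(key)

class HeaderReader:
    """Wrap any iterable of rows; treat the first row as column names."""
    def __init__(self, rows):
        self._rows = iter(rows)
        header = next(self._rows)
        self._index_by_name = {name: i for i, name in enumerate(header)}

    def __iter__(self):
        for fields in self._rows:
            yield Row(fields, self._index_by_name)

data = io.StringIO("name\tpH\nwater\t7.0\nvinegar\t2.4\n")
reader = HeaderReader(csv.reader(data, delimiter="\t"))
all_rows = list(reader)
ph_values = [row["pH"] for row in all_rows]
```

Because HeaderReader only needs an iterable of rows, it works equally
well on top of a list of lists as on top of a csv reader.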


   - I prefer 'append' over 'write'

Consider a copy.  Under the current scheme

   def copy(input, output):
     for row in input:
       output.write(row)

This allows the input to be a list or a csv.reader or any other
iterable object.  However, output objects must implement the
'write' method, which for other cases is something which takes
a string, not something which takes an object.

OTOH, consider

   def copy(input, output):
     for row in input:
       output.append(row)

This lets me pass in a list as the output -- very handy when
I decide I want to display the output to, say, the screen instead
of save to a file.  I don't have to implement a wrapper object
like

class _adapter:
   def __init__(self, output):
     self.output = output
   def write(self, x):
     self.output.append(x)

And, people expect that 'append' takes an object, not a string.
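
A sketch of the append-based copy in use.  It assumes the csv module as
later shipped, whose writer exposes writerow rather than append, so
WriterAdapter below is a hypothetical shim for that one case; a plain
list needs no wrapper at all.

```python
import csv
import io

def copy(input, output):
    """Append every row from any iterable to any object with append()."""
    for row in input:
        output.append(row)

class WriterAdapter:
    """Hypothetical shim giving a csv writer the append() interface."""
    def __init__(self, writer):
        self._writer = writer

    def append(self, row):
        self._writer.writerow(row)

rows = [["name", "pH"], ["water", "7.0"]]

on_screen = []                       # a plain list works as the output
copy(rows, on_screen)

sink = io.StringIO()                 # stands in for an open file
copy(rows, WriterAdapter(csv.writer(sink, lineterminator="\n")))
saved = sink.getvalue()
```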

					Andrew
					dalke at dalkescientific.com




