[Csv] Module question...

Skip Montanaro skip at pobox.com
Thu Jan 30 13:37:46 CET 2003


    >> One other possibility would be for the parser to only deal with one
    >> row at a time, leaving it up to the user code to feed the parser the
    >> row strings. But given the various possible line endings for a row of
    >> data and the fact that a column of a row may contain a line ending,
    >> not to mention all the other escape character issues we've discussed,
    >> this would be error-prone.

    Andrew> This is the way the Object Craft module has worked - it works
    Andrew> well enough, and the universal end-of-line stuff in 2.3 makes it
    Andrew> more seamless. Not saying I'm wedded to this scheme, but I'd
    Andrew> just like to be clear on why we've chosen one over the other.

You have to be careful.  I think the Universal eol stuff might bite you in
the arse here.  Recall that in Excel, the default line terminator (record
separator?) is CRLF, but that a hard return within a cell is simply LF.  I
don't know what Universal eol handling will do with that.  In any case,
because you have to have full control over line termination, I think you
have to start dealing just with binary files.
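
A minimal sketch of the concern, written in present-day Python rather than
2.3, with a made-up sample row.  It only shows what newline translation does
to a CRLF-terminated record that has a bare LF inside a quoted cell; the
helper name read_text is hypothetical.

```python
import io

# One Excel-style record: CRLF ends the record, a bare LF sits inside a cell.
raw = b'a,"line one\nline two",c\r\nd,e,f\r\n'


def read_text(data, newline):
    """Decode bytes as text using the given newline-handling mode."""
    wrapper = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline=newline)
    return wrapper.read()


# newline=None turns on universal newline translation: CRLF becomes LF.
translated = read_text(raw, None)

# newline="" leaves every line ending exactly as it was in the file.
untouched = read_text(raw, "")
```

After translation the record terminator and the in-cell return are both a
bare LF, so only the quoting still tells them apart.  Reading without
translation keeps the two distinct.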

    Andrew> I'm trying to think of an example where operating on a file-like
    Andrew> object would be too restricting, and I can't - oh, here's one:
    Andrew> what if you wanted to do some pre-processing on the data (say it
    Andrew> was uuencoded)?

Then you force the user to uudecode the file and stuff it into a StringIO
object. ;-)
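
Roughly like this, as a sketch only.  It assumes present-day Python, uses
binascii for the uu coding of the body lines (no begin/end header handling),
and feeds the result to csv.reader as it exists today, which may not match
what we end up with here.  The helper names and the payload are made up.

```python
import binascii
import csv
import io


def uu_encode_body(data):
    """uuencode bytes into body lines of at most 45 input bytes each."""
    return [binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)]


def uu_decode_body(lines):
    """Decode uuencoded body lines back into the original bytes."""
    return b"".join(binascii.a2b_uu(line) for line in lines)


# Hypothetical payload: CRLF record terminators, bare LF inside a cell.
payload = b'name,note\r\nspam,"two\nlines"\r\n'
encoded = uu_encode_body(payload)

# The user's pre-processing step: decode, then wrap in a file-like object.
decoded = uu_decode_body(encoded)
buffer = io.StringIO(decoded.decode("ascii"), newline="")

rows = list(csv.reader(buffer))
```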

    Andrew> Should the object just be defined as an iterable,

I had envisioned that the object the csv.reader() factory function (or
class) returned would be an iterable and that the object the csv.writer()
factory function (or class) returned would accept an iterable.
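
For illustration, here is how that shape looks with the csv module in
present-day Python.  Treat it as a sketch of the idea rather than a statement
of what this design will be; the sample lines are invented.

```python
import csv
import io

# Any iterable of strings will do as input; it need not be a file.
lines = ["a,b,c\r\n", "1,2,3\r\n"]

# The reader is itself an iterable that yields one list per record.
reader = csv.reader(lines)
rows = [row for row in reader]

# The writer accepts any iterable of rows, here a generator expression.
out = io.StringIO(newline="")
writer = csv.writer(out)
writer.writerows(row for row in rows)

written = out.getvalue()
```

Neither side needs to know whether the rows came from a file, a socket, or a
list built in memory.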

    Andrew> closing, etc, up to the user of the module? One downside of this
    Andrew> is you can't rewind an iterator, so things like the sniffer
    Andrew> would be SOL. We can't ensure that the passed file is rewindable
    Andrew> either. Hmmm.

The sniffer is going to be in a csvutils module, correct?  It could
certainly accept either a filename or a string containing some subset
of the rows in the file to be sniffed.  I see no reason to constrain it to
the csv.reader()'s interface.
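
A sketch of both entry points, assuming the csv.Sniffer class from
present-day Python stands in for whatever the sniffer turns out to be.  The
helper names sniff_sample and sniff_file, the candidate delimiter set, and
the sample data are all hypothetical.

```python
import csv
import os
import tempfile

# Assumed candidate set; restricting it keeps the guess predictable.
CANDIDATES = ",;\t"


def sniff_sample(sample):
    """Guess a dialect from a string holding the first few rows."""
    return csv.Sniffer().sniff(sample, delimiters=CANDIDATES)


def sniff_file(filename, sample_size=4096):
    """Open the file separately, so no reader ever needs rewinding."""
    with open(filename, newline="") as handle:
        return sniff_sample(handle.read(sample_size))


sample = "a;b;c\r\n1;2;3\r\n4;5;6\r\n"
from_string = sniff_sample(sample)

# Write the same sample to a temporary file and sniff it by name.
with tempfile.NamedTemporaryFile("w", newline="", suffix=".csv", delete=False) as tmp:
    tmp.write(sample)
try:
    from_file = sniff_file(tmp.name)
finally:
    os.remove(tmp.name)
```

Because the sniffer opens the file itself or works from a string, the
iterator handed to the reader is never consumed by the sniffing step.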

Skip

