[Csv] trial zip/tar packages of csv module available
John Machin
sjmachin at lexicon.net
Thu Feb 13 23:16:51 CET 2003
On Thu, 13 Feb 2003 13:00:36 -0600, Skip Montanaro <skip at pobox.com> wrote:
>
> If you are interested in reading or writing CSV files from Python and you
> have Python 2.2 or 2.3 available, please take a moment to download, extract
> and install either or both of the following URLs:
>
> http://manatee.mojam.com/~skip/csv.tar.gz
> http://manatee.mojam.com/~skip/csv.zip
>
> The goal is to get this package into Python 2.3, though we've tried to keep
> it working under 2.2. It uses iterators, so I don't know if it will work
> with anything before 2.2. The package has been built on Linux and Mac OS X
> at this point. I think it's been built on Windows though I'm not positive.
> There shouldn't be anything terribly platform-dependent there.
>
Good news first, whinges at the end of the message :-)
===
Compiles & installs OK out-of-the-box with Python 2.2, Windows 2000, BCC32
(Borland 5.5 freebie command-line compiler) -- thanks to revision 1.30 :-)
===
C:\csv\test>python test_csv.py
*** skipping leakage tests ***
........................................................
----------------------------------------------------------------------
Ran 56 tests in 0.030s
OK
===
Slurped through a 150 MB CSV file at a reasonable speed without any memory
leak that could be detected by the primitive method of watching the Task
Manager memory graph.
===
Doco:
"""0.1.1 Module Contents
The csv module defines the following functions.
reader(iterable[, dialect="excel"] [, fmtparam])
Return a reader object which will iterate over lines in the given
csvfile."""
Huh? What "given csvfile"?
Need to define carefully what iterable.next() is expected to deliver; a
line, with or without a trailing newline? a string of 1 or more bytes which
may contain embedded line separators, either as true separators or as
(quoted) data? [e.g. iterable could be a generator which uses say
read(16384)]. I have noticed in the csv mailing list some muttering along
the lines of "the iterable's underlying file must have been opened in
binary mode"!? Que?
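To make the question concrete, here is a minimal sketch of one possible contract: the reader gets exactly one line per next() call, line ending kept, and a small adapter turns arbitrary read(16384)-style chunks into such lines. The name lines_from_chunks is hypothetical and not part of the package, and the sketch is written against the csv module as shipped in current Python 3, not the trial package discussed here.

```python
import csv


def lines_from_chunks(chunks):
    """Yield newline-terminated lines from arbitrary string chunks.

    Hypothetical adapter: 'chunks' is any iterable of strings, for example
    the results of repeated read(16384) calls.
    """
    pending = ""
    for chunk in chunks:
        pending += chunk
        while True:
            head, sep, tail = pending.partition("\n")
            if not sep:
                break  # no complete line yet, wait for the next chunk
            yield head + sep  # keep the line ending for the reader
            pending = tail
    if pending:
        yield pending  # last line had no trailing newline


# Chunk boundaries fall mid-field and inside a quoted field.
chunks = ['f1,"a', '\nb",f3\nx,', 'y,z']
rows = list(csv.reader(lines_from_chunks(chunks)))
```

Splitting only on '\n' keeps a '\r\n' pair together even when it straddles a chunk boundary, and a quoted field containing a newline is still handed over as two consecutive lines, which the reader joins back up.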
This might necessitate a FAQ entry:
>>> cr = csv.reader("iterable is string!")
>>> [x for x in cr]
[['i'], ['t'], ['e'], ['r'], ['a'], ['b'], ['l'], ['e'], [' '], ['i'],
['s'], [' '], ['s'], ['t'], ['r'], ['i'], ['n'], ['g'], ['!']]
>>>
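A possible shape for that FAQ answer, as a sketch only: a string is itself an iterable of one-character strings, so it has to be wrapped in something that yields lines. The helper rows_from_text is a hypothetical name, and the code assumes the csv and io modules of current Python 3 rather than the trial package.

```python
import csv
import io


def rows_from_text(text):
    """Parse CSV data held in one string (hypothetical helper)."""
    # StringIO iterates line by line; newline="" leaves the line endings
    # untouched so the reader sees them as they are in the data.
    return list(csv.reader(io.StringIO(text, newline="")))


# A single record can also simply be wrapped in a list.
one_row = list(csv.reader(["iterable is string!"]))
two_rows = rows_from_text("a,b\n1,2\n")
```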
===
Does the reader detect any errors at all? E.g. I expected some complaint
here, instead of silently doing nothing:
>>> import csv
>>> cr = csv.reader(['f1,"unterminated quoted field,f3'])
>>> for x in cr: print x
...
>>> cr = csv.reader(['f1,"terminated quoted field",f3'])
>>> for x in cr: print x
...
['f1', 'terminated quoted field', 'f3']
>>> cr = csv.reader(['f1,"unterminated quoted field,f3\n'])
>>> for x in cr: print x
...
>>>
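For comparison, the csv module in current Python 3 has a strict format parameter that turns bad input into csv.Error; it is lenient by default. The sketch below assumes that later module, not the trial package, and parses_strictly is a hypothetical helper name.

```python
import csv


def parses_strictly(lines):
    """Return True if 'lines' parse without csv.Error under strict=True."""
    try:
        for _row in csv.reader(lines, strict=True):
            pass
    except csv.Error:
        # Raised, for example, when the input ends inside a quoted field.
        return False
    return True


ok = parses_strictly(['f1,"terminated quoted field",f3'])
bad = parses_strictly(['f1,"unterminated quoted field,f3'])
```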
===
Judging by the fact that in _csv.c '\0' is passed around as a line-ending
signal, it's not 8-bit-clean. This fact should be at least documented, if
not fixed (which looks like a bit of a rewrite). Strange behaviour on
embedded '\0' may worry not only pedants but also folk who are recipients
of data files created by J. Random Boofhead III and friends.
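Until that is settled, recipients of such files could check for the problem before parsing. A minimal sketch, with the hypothetical name first_nul_offset, which scans binary chunks for an embedded '\0' and says nothing about how the reader itself reacts to one:

```python
def first_nul_offset(chunks):
    """Return the byte offset of the first NUL in 'chunks', or None.

    Hypothetical helper: 'chunks' is any iterable of bytes objects, for
    example repeated read(16384) calls on a file opened in binary mode.
    """
    offset = 0
    for chunk in chunks:
        pos = chunk.find(b"\0")
        if pos != -1:
            return offset + pos
        offset += len(chunk)
    return None


clean = first_nul_offset([b"f1,f2\n", b"a,b\n"])
dirty = first_nul_offset([b"abc", b"d\0e"])
```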
===
Cheers,
John