CSV(???)

John Machin sjmachin at lexicon.net
Fri Feb 23 10:31:35 EST 2007


On Feb 23, 10:11 pm, David C. Ullrich <ullr... at math.okstate.edu>
wrote:
> Is there a csvlib out there somewhere?

I can make available the following which should be capable of running
on 1.5.2 -- unless they've suffered bitrot :-)

(a) a csv.py which does simple line-at-a-time hard-coded-delimiter-etc
pack and unpack i.e. very similar to your functionality *except* that
it doesn't handle newline embedded in a field. You may in any case be
interested to see a different way of writing this sort of thing: my
unpack does extensive error checking; it uses a finite state machine
so unexpected input in any state is automatically an error.
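
For illustration, here is a minimal sketch of the finite-state-machine
approach described in (a). It is not the csv.py offered above: the names
(unpack, CSVError, the four state constants) are hypothetical, it is written
for current Python rather than 1.5.2, and it assumes a comma delimiter,
double-quote quoting, and that any newline in the input is an error.

```python
class CSVError(ValueError):
    """Raised on malformed input (hypothetical name, not from the post)."""

# States of the machine; anything not explicitly allowed in a state is an error.
START_FIELD, IN_UNQUOTED, IN_QUOTED, AFTER_QUOTE = range(4)

def unpack(line, delim=",", quote='"'):
    """Split one CSV line into a list of field strings."""
    fields = []
    buf = []
    state = START_FIELD
    for pos, ch in enumerate(line):
        if ch in "\r\n":
            # Assumption: line-at-a-time only, so no embedded newlines.
            raise CSVError("newline at position %d" % pos)
        if state == START_FIELD:
            if ch == quote:
                state = IN_QUOTED
            elif ch == delim:
                fields.append("")          # empty field
            else:
                buf.append(ch)
                state = IN_UNQUOTED
        elif state == IN_UNQUOTED:
            if ch == delim:
                fields.append("".join(buf))
                buf = []
                state = START_FIELD
            elif ch == quote:
                raise CSVError("quote in unquoted field at position %d" % pos)
            else:
                buf.append(ch)
        elif state == IN_QUOTED:
            if ch == quote:
                state = AFTER_QUOTE        # closing quote, or first of a pair
            else:
                buf.append(ch)
        else:  # AFTER_QUOTE
            if ch == quote:
                buf.append(quote)          # doubled quote -> one literal quote
                state = IN_QUOTED
            elif ch == delim:
                fields.append("".join(buf))
                buf = []
                state = START_FIELD
            else:
                raise CSVError("text after closing quote at position %d" % pos)
    if state == IN_QUOTED:
        raise CSVError("unterminated quoted field")
    fields.append("".join(buf))            # flush the last field
    return fields

print(unpack('a,"b,c","say ""hi"""'))      # ['a', 'b,c', 'say "hi"']
```

Because each state lists only the characters it accepts, malformed input
such as 'a"b' or '"abc' falls through to an error rather than being
silently mis-parsed.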

(b) an extension module (i.e. written in C) with the same API. The
python version (a) imports and uses (b) if it exists.

(c) an extension module which parameterises everything including the
ability to handle embedded newlines.

The two extension modules have never been compiled & tested on anything
other than Windows, but IIRC both should be compilable with gcc (MinGW)
and the free Borland 5.5 compiler -- in other words, vanilla C which
should compile OK on Linux etc.

If you are interested in any of the above, just e-mail me.

>
> And/or does anyone see any problems with
> the code below?
>
> What csvline does is straightforward: fields
> is a list of strings. csvline(fields) returns
> the strings concatenated into one string
> separated by commas. Except that if a field
> contains a comma or a double quote then the
> double quote is escaped to a pair of double
> quotes and the field is enclosed in double
> quotes.
>
> The part that seems somewhat hideous is
> parsecsvline. The intention is that
> parsecsvline(csvline(fields)) should be
> the same as fields. Haven't attempted
> to deal with parsecsvline(data) where
> data is in an invalid format - in the
> intended application data will always
> be something that was returned by
> csvline.

"Always"? Famous last words :-)

> It seems right after some
> testing... also seems blechitudinous.

I agree that it's bletchworthy, but only mildly so. If it'll make you
feel better, I can send you as a yardstick csv pack and unpack written
in awk -- that's definitely *not* a thing of beauty and a joy
forever :-)

I presume that you don't write csvline() output to a file, using
newline as a record terminator and then try to read them back and pull
them apart with parsecsvline() -- such a tactic would of course blow
up on the first embedded newline. So as a matter of curiosity,
where/how are you storing multiple csvline() outputs?
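
To make the embedded-newline problem concrete, here is a small sketch. The
csvline() below is a reconstruction from the description quoted above, not
the original poster's code (which was snipped), and it is written for
current Python rather than 1.5.x; the sample records are made up.

```python
def csvline(fields):
    """Join fields with commas; quote fields containing a comma or a quote."""
    out = []
    for field in fields:
        if "," in field or '"' in field:
            field = '"' + field.replace('"', '""') + '"'
        out.append(field)
    return ",".join(out)

# Two records, one of which has a newline embedded in a field.
records = [["a", "line one\nline two"], ["b", "c"]]

# Store them using newline as the record terminator ...
text = "".join(csvline(r) + "\n" for r in records)

# ... then split on newline to get the records back.
lines = text.split("\n")[:-1]

# Two records went in, but three "lines" come out.
print(len(records), len(lines))   # 2 3
```

The second "line" is just 'line two', so feeding each line to a
one-line-at-a-time parser yields the wrong number of records with the wrong
contents.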

>
> (Um: Believe it or not I'm _still_ using
> python 1.5.7. So comments about iterators,
> list comprehensions, string methods, etc
> are irrelevant. Comments about errors in
> the algorithm would be great. Thanks.)

1.5.7 ?
[big snip]

Cheers,
John



