* separated values

Cliff Wells logiplexsoftware at earthlink.net
Tue Jan 15 16:45:46 EST 2002


On Tue, 15 Jan 2002 15:31:06 -0600
Skip Montanaro wrote:

> 
>     >> I suggest Dave Cole's csv module (fast, easy to use, appears
>     >> complete) be added to the core:
> 
>     Cliff> How dare you. ;)
> 
>     Cliff> www.sf.net/projects/python-dsv/
> 
> I wasn't even aware of your package...  Aside from the heuristic delimiter
> bit, python-dsv sounds (functionally) about like Dave Cole's csv module.  As
> I mentioned before, performance is a significant issue for me because I both
> generate and parse some very large CSV files.  The fact that the csv module
> is written in C makes a difference.
> 
> how-about-a-shootout?-ly, y'rs,
> 
> Skip

I've heard (I haven't looked for myself) that Dave's module doesn't support
embedded newlines, which is a big killer for many CSV files (correct me if
I'm wrong - I'm not trying to slag someone else's work).  I'd be interested
in performance tests, but I don't have any big CSV files lying around any
more... perhaps you could test?
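
To make the embedded-newline problem concrete, here is a minimal sketch.  It
assumes the csv module from today's Python standard library, not Dave Cole's
module or python-dsv, and the sample data is made up for illustration.  It
shows why splitting the input on line breaks miscounts records when a quoted
field contains a newline.

```python
import csv
import io

# Hypothetical sample: a header plus ONE data record whose second field
# contains a newline inside the quotes.
data = 'name,note\r\n"Smith","line one\nline two"\r\n'

# Naive approach: treat every physical line as a record.
naive_rows = [line.split(",") for line in data.splitlines()]

# Quote-aware approach: let a CSV parser decide where records end.
# newline="" keeps the line endings untouched so the parser sees them.
parsed_rows = list(csv.reader(io.StringIO(data, newline="")))

print(len(naive_rows))   # physical lines, including the broken half-record
print(len(parsed_rows))  # logical records
print(parsed_rows[1])
```

The naive split sees three "records" because it cuts the quoted field in
half, while the quote-aware parser returns two and keeps the newline inside
the field.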

One of the drawbacks of my current implementation is that it loads the
entire file into a list of lists rather than iterating over it.  I'm
thinking about rewriting it to allow iteration (which would be useful for
huge files).
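
The difference between the two approaches can be sketched roughly as follows.
This is not python-dsv's API: load_all and iter_rows are hypothetical names,
and the parsing is delegated to the csv module in today's standard library
purely for illustration.

```python
import csv
import io


def load_all(stream):
    """Read every record into memory at once (a list of lists)."""
    return [row for row in csv.reader(stream)]


def iter_rows(stream):
    """Yield one record at a time instead of building the whole list."""
    for row in csv.reader(stream):
        yield row


# Hypothetical sample data standing in for a much larger file.
sample = "a,b\r\n1,2\r\n3,4\r\n"

# Whole-file approach: memory use grows with the number of records.
all_rows = load_all(io.StringIO(sample, newline=""))

# Iterating approach: only the current record is held at any moment.
rows = iter_rows(io.StringIO(sample, newline=""))
header = next(rows)
total = sum(int(row[1]) for row in rows)

print(all_rows)
print(header, total)
```

With the generator version the caller can aggregate or filter a huge file
without ever holding more than one record, at the cost of losing random
access to earlier rows.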

-- 
Cliff Wells
Software Engineer
Logiplex Corporation (www.logiplex.net)
(503) 978-6726 x308
(800) 735-0555 x308



