finding out the number of rows in a CSV file [Resolved]

norseman norseman at hughes.net
Wed Aug 27 17:51:17 EDT 2008


Peter Otten wrote:
> John S wrote:
> 
>> [OP] Jon Clements wrote:
>>> On Aug 27, 12:54 pm, SimonPalmer <simon.pal... at gmail.com> wrote:
>>>> after reading the file through the csv.reader for the length I cannot
>>>> iterate over the rows.  How do I reset the row iterator?
>> A CSV file is just a text file. Don't use csv.reader for counting rows
>> -- it's overkill. You can just read the file normally, counting lines
>> (lines == rows).
> 
> Wrong. A field may have embedded newlines:
> 
> >>> import csv
> >>> csv.writer(open("tmp.csv", "w")).writerow(["a" + "\n"*10 + "b"])
> >>> sum(1 for row in csv.reader(open("tmp.csv")))
> 1
> >>> sum(1 for line in open("tmp.csv"))
> 11
> 
> Peter
> 

=============================
Well... a semantics problem here.


A blank line is just an EOL by itself. Yes.
I may want to count these. Could be indicative of a problem.
Besides, sum(1 for line in f if line.strip()) handles the problem if I'm
not counting blanks and still avoids tossing, re-opening etc...
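
For what it's worth, a minimal sketch of a one-pass count that reports both
numbers, so blank lines can be counted or ignored as needed. The helper name
count_lines is made up for illustration, and io.StringIO only stands in for
a real file:

```python
import io

def count_lines(f):
    """Return (total_lines, non_blank_lines) for an open text file."""
    total = 0
    non_blank = 0
    for line in f:
        total += 1
        # A blank line still carries its EOL, so test the stripped text
        if line.strip():
            non_blank += 1
    return total, non_blank

# io.StringIO stands in for a real file here
sample = io.StringIO("a,b\n\nc,d\n")
print(count_lines(sample))  # (3, 2)
```

A difference between the two numbers is the "could be indicative of a
problem" signal: it says blank lines are present without a second read.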

Again - it's how you look at it, but I don't want EOLs in my database
fields. csv was designed to 'dump' database fields into text for those
not affording a database program and/or to convert between database
programs. By the way - has anyone seen a good spreadsheet dumper?  One
that dumps the underlying formulas and such along with the display
value?  That would greatly facilitate portability, wouldn't it?  (Yeah -
the receiving end would have to be able to read it. But it would be a
start - yes?)  Everyone got the point?  Just because it gets abused
doesn't mean ....   Are we back on track?  Number of lines equals number
of reads - which is what was requested. No bytes magically disappearing.
No sleight of hand, no one dictating how to or what with ....

The good part is everyone who reads this now knows two ways to approach
the problem and the pros/cons of each. No losers.
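
To put the two approaches side by side, here is a rough sketch that counts
physical lines and csv records on the same open file, rewinding with seek(0)
instead of re-opening. The name count_both is hypothetical, and it assumes a
seekable file object (an in-memory buffer is used here in place of tmp.csv):

```python
import csv
import io

def count_both(f):
    """Return (physical_lines, csv_records) for a seekable text file."""
    physical = sum(1 for _ in f)
    f.seek(0)  # rewind instead of re-opening
    records = sum(1 for _ in csv.reader(f))
    f.seek(0)  # leave the file ready for the real pass over the rows
    return physical, records

# Same data as Peter's example: one field with ten embedded newlines
buf = io.StringIO(newline="")
csv.writer(buf).writerow(["a" + "\n" * 10 + "b"])
buf.seek(0)
print(count_both(buf))  # (11, 1)
```

If the two numbers agree, plain line counting was good enough for that
file; if they differ, some field has an embedded EOL and only the csv count
matches the number of rows.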



Steve
norseman at hughes.net


