[Tutor] What is the best way to count the number of lines in a huge file?

dman dsh8290@rit.edu
Thu, 6 Sep 2001 17:15:17 -0400


On Thu, Sep 06, 2001 at 04:22:22PM +0200, Remco Gerlich wrote:
| On  0, dman <dsh8290@rit.edu> wrote:
| > On Thu, Sep 06, 2001 at 02:50:04AM -0400, Ignacio Vazquez-Abrams wrote:
| > | On Thu, 6 Sep 2001, HY wrote:
<major snippage>
| > | a=None
| > | n=0
| > | while not a=='':
| > |   a=file.read(262144) # season to taste
| > |   n+=a.count('\n')
| > 
| > Just beware of Macs.  You won't find a single \n in a Mac text file
| > because they use \r instead.  FYI in case you have to deal with a text
| > file that came from a Mac.
| 
| If the file is opened in text mode, then that is Python's problem (actually
| the C library's), not yours. From the Python side it all looks like \n,
| whatever the system.

That's cool, lines always end properly in python :-).  I don't think
that will work on a unix box, though, where "text" mode is no
different from "binary" mode.  I don't think it works on a windows
box either when opening a mac file.  My reasoning is that I had to
tweak an HTML file recently.  I opened the file (first on the FreeBSD
web host) using vim.  It reported "noeol" and everything was on one
huge line.  I got the same effect when I copied it to my win2k
workstation and opened it with vim.  (I was going to use python,
which is why I copied the file to my windows box, but then I learned
how to use vim's substitute strings properly.)  A little bit of
substitution replaced every \r with \n and made the file a "proper"
(unix) text file.

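For the original counting question, here is a rough sketch of a counter
that doesn't care which platform the file came from.  It opens the file
in binary mode, so no text-mode translation is involved, and it treats
\n, \r\n and a bare \r each as one line ending.  The function name and
the chunk size are my own choices for illustration, and it is written
for a current Python 3 (where text mode would translate all three
endings by default anyway).  It counts terminators, so a last line
with no end-of-line marker (vim's "noeol") is not counted.

```python
def count_line_endings(path, chunk_size=262144):
    """Count line terminators (\\n, \\r\\n or bare \\r) in a file, read in chunks."""
    count = 0
    prev_ended_with_cr = False  # did the previous chunk end in \r?
    with open(path, "rb") as f:  # binary mode: bytes arrive untranslated
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Every \n and every \r is a candidate ending, but a \r\n
            # pair must only count once.
            count += chunk.count(b"\n") + chunk.count(b"\r") - chunk.count(b"\r\n")
            # A \r\n pair split across two chunks was counted twice.
            if prev_ended_with_cr and chunk.startswith(b"\n"):
                count -= 1
            prev_ended_with_cr = chunk.endswith(b"\r")
    return count
```

Calling count_line_endings("page.html") (the file name is only a
placeholder) should then give the same answer whether the file was
saved on a Mac, a unix box or windows.
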
-D