[Tutor] What is the best way to count the number of lines in a huge file?
dman
dsh8290@rit.edu
Thu, 6 Sep 2001 17:15:17 -0400
On Thu, Sep 06, 2001 at 04:22:22PM +0200, Remco Gerlich wrote:
| On 0, dman <dsh8290@rit.edu> wrote:
| > On Thu, Sep 06, 2001 at 02:50:04AM -0400, Ignacio Vazquez-Abrams wrote:
| > | On Thu, 6 Sep 2001, HY wrote:
<major snippage>
| > | a=None
| > | n=0
| > | while not a=='':
| > |     a=file.read(262144) # season to taste
| > |     n+=a.count('\n')
| >
| > Just beware of Macs. You won't find a single \n in a Mac text file
| > because they use \r instead. FYI in case you have to deal with a text
| > file that came from a Mac.
|
| If the file is opened in text mode, then that is Python's problem (actually
| the C library's), not yours. From the Python side it all looks like \n,
| whatever the system.
That's cool -- lines always end properly in Python :-). I don't think
that will work on a Unix box, though -- there, "text" mode is no
different from "binary" mode. I don't think this theory works on a
Windows box either when opening a Mac file. My reasoning is that I
had to tweak an HTML file recently. I opened the file (first on the
FreeBSD web host) using vim -- it reported "noeol" and everything was
on one huge line. I got the same effect when I copied it to my win2k
workstation and opened it with vim. I used a little bit of
substitution (I was going to use Python, which is why I copied the file
to my Windows box, but then I learned how to use vim's substitute
strings properly) to replace every \r with \n and make the file a
"proper" (Unix) text file.
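
Here is a minimal sketch of a chunked counter that doesn't depend on
the platform's "text" mode at all. It is not code from this thread:
the names count_lines and prev_cr are made up for illustration, and it
assumes a current Python 3 with the file opened in binary mode ('rb')
so that nothing gets translated behind your back. It counts \n, \r\n
and lone \r as one line ending each, and takes care of a \r\n pair
that happens to straddle two chunks.

```python
import io


def count_lines(stream, chunk_size=262144):
    """Count line endings (\\n, \\r\\n or lone \\r) in a binary stream."""
    n = 0
    prev_cr = False  # True if the previous chunk ended with \r
    while True:
        chunk = stream.read(chunk_size)  # season to taste
        if not chunk:
            break
        # A leading \n completes a \r\n pair whose \r was already
        # counted at the end of the previous chunk, so undo that count.
        if prev_cr and chunk.startswith(b"\n"):
            n -= 1
        # Every \r and every \n is a line ending, except that each
        # \r\n pair inside the chunk must be counted only once.
        n += chunk.count(b"\r") + chunk.count(b"\n") - chunk.count(b"\r\n")
        prev_cr = chunk.endswith(b"\r")
    return n


# Stand-in for a Mac-style file; a real one would be open(path, "rb").
mac_text = io.BytesIO(b"one\rtwo\rthree\r")
print(count_lines(mac_text))  # 3
```

Like the original loop, this counts line terminators, so a last line
with no terminator (vim's "noeol" case) is not counted.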
-D