How to count lines in a text file?

Ling Lee janimal at mail.trillegaarden.dk
Mon Sep 20 15:58:30 EDT 2004


Thanks for explaining it that well, really makes sense now :)

Cheers....
"Andrew Dalke" <adalke at mindspring.com> wrote in message 
news:ekE3d.648$g42.95 at newsread3.news.pas.earthlink.net...
> Ling Lee wrote:
>> 2) I made the first part like this:
>>
>> in_file = raw_input("What is the name of the file you want to open: ")
>> in_file = open("test.txt","r")
>> text = in_file.read()
>
> You have two different objects related to the file.
> One is the filename (the result of calling raw_input) and
> the other is the file handle (the result of calling open).
> You are using same variable name for both of them.  You
> really should make them different.
>
> First you get the file name and reference it by the variable
> named 'in_file'.  Next you use another filename ("test.txt")
> for the open call.  This returns a file handle, but not
> a file handle to the file named in 'in_file'.
>
> You then change things so that 'in_file' no longer refers
> to the filename but now refers to the file handle.
>
> A nicer solution is to use one variable name for the name
> (like "in_filename") and another for the handle (you can
> keep "in_file" if you want to).  In the following I
> reformatted it so the example fits in under 80 columns.
>
>    in_filename = raw_input("What is the name of the file "
>                            "you want to open: ")
>    in_file = open(in_filename,"r")
>    text = in_file.read()
>
>
> Now the in_file.read() reads all of the file into memory.  There
> are several ways to count the number of lines.  The first is
> to count the number of newline characters.  Because the newline
> character is special, it's most often written as what's called
> an escape code.  In this case, "\n".  Others are backspace ("\b")
> and beep ("\a"), and backslash ("\\") since otherwise there's
> no way to get the single character "\".
>
> Here's how to count the number of newlines in the text:
>
> num_lines = text.count("\n")
>
> print "There are", num_lines, "lines in", in_filename
>
>
> This will work for almost every file except for one where
> the last line doesn't end with a newline.  It's rare, but
> it does happen.  To fix that you need to see if the
> text ends with a newline and if it doesn't then add one
> more to the count
>
>
> num_lines = text.count("\n")
> if not text.endswith("\n"):
>   num_lines = num_lines + 1
>
> print "There are", num_lines, "lines in", in_filename
>
>
>> 3) I think that I have to use a for loop ( something like
>> for line in text: count +=1)
>
> Something like that will work, but not as written.  When you say "for xxxx in string"
> it loops through every character in the string, and not
> every line.  What you need is some way to get the lines.
>
> One solution is to use the 'splitlines' method of strings.
> This knows how to deal with the "final line doesn't end with
> a newline" case and return a list of all the lines.  You
> can use it like this
>
>   count = 0
>   for line in text.splitlines():
>     count = count + 1
>
> or, since splitlines() returns a list of lines you can
> also do
>
>   count = len(text.splitlines())
>
> It turns out that reading lines from a file is very common.
> When you say "for xxx in file" it loops through every line
> in the file.  This is not a list so you can't say
>
>   len(open(in_filename, "r"))  # DOES NOT WORK
>
> instead you need to have the explicit loop, like this
>
>   count = 0
>   for line in open(in_filename, "r"):
>     count = count + 1
>
> An advantage to this approach is that it doesn't read
> the whole file into memory.  Reading it all is only a problem
> if you have a large file.  Try counting the number of
> lines in a 1.5 GB file!
>
> By the way, "r" is the default mode for a file open.
> Most people omit it from the parameter list and just use
>
>    open(in_filename)
>
> Hope this helped!
>
> By the way, you might want to look at the "Beginner's
> Guide to Python" page at http://python.org/topics/learn/ .
> It has pointers to resources that might help, including
> the tutor mailing list meant for people like you who
> are learning to program in Python.
>
> Andrew
> dalke at dalkescientific.com 
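
For anyone trying the two in-memory approaches from the reply above on a
current interpreter, here is a minimal sketch. It assumes Python 3 syntax
(the quoted code is Python 2), and the function names count_by_newlines and
count_by_splitlines are made up for illustration. One small deviation from
the quoted version: an empty text is counted as zero lines rather than one.

```python
def count_by_newlines(text):
    """Count newline characters, adding one for an unterminated last line."""
    num_lines = text.count("\n")
    # A non-empty text whose last line lacks a trailing newline still
    # has that last line, so count it too. (The "text and" guard is an
    # addition here so that an empty text gives 0.)
    if text and not text.endswith("\n"):
        num_lines += 1
    return num_lines


def count_by_splitlines(text):
    """Let str.splitlines() handle the unterminated-last-line case."""
    return len(text.splitlines())


sample = "first\nsecond\nthird"   # last line has no trailing newline
print(count_by_newlines(sample), count_by_splitlines(sample))
```

The two functions agree on ordinary text. splitlines() also treats a few
other characters (such as "\r" and form feed) as line boundaries, so the
results can differ on unusual input.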

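The loop-over-the-file approach from the reply can be sketched the same way.
This is again an illustration in Python 3 syntax, not the code from the
thread: the name count_lines_in_file and the throwaway demo file are
assumptions added here so the example runs on its own.

```python
import os
import tempfile


def count_lines_in_file(filename):
    """Count lines by iterating over the file, one line at a time."""
    count = 0
    # The with statement closes the file once the loop is done.
    with open(filename) as in_file:
        for line in in_file:
            count += 1
    return count


# Small self-contained demo using a throwaway file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("one\ntwo\nthree")   # no trailing newline on the last line
    demo_filename = tmp.name

print(count_lines_in_file(demo_filename))
os.remove(demo_filename)
```

Because the file is read line by line, only one line needs to be held in
memory at a time, which is the advantage the reply mentions for large files.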



