counting how often the same word appears in a txt file...But my code only prints the last line entry in the txt file

Steven D'Aprano steve+comp.lang.python at pearwood.info
Wed Dec 19 06:03:21 EST 2012


On Wed, 19 Dec 2012 02:45:13 -0800, dgcosgrave wrote:

> Hi, I am just starting out with Python... My code below changes the txt
> file into a list, adds the words to an empty dictionary and prints how
> often each word occurs, but it only seems to recognise and print the
> last entry of the txt file. Any help would be great.
> 
> tm =open('ask.txt', 'r')
> dict = {}
> for line in tm:
> 	line = line.strip()
> 	line = line.translate(None, '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~')
>       line = line.lower()
> 	list = line.split(' ')

Note: you should use descriptive names. Since this is a list of WORDS, a 
much better name would be "words" rather than "list". Also, list is a 
built-in type, and re-using its name hides the built-in, so any later 
call such as list(something) will fail. The same goes for using "dict" 
as you do.
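
To make the trouble with re-using a built-in name concrete, here is a small 
sketch. The function name shadowing_demo and the sample string are made up 
for illustration; the error text shown in the comment is what Python 3 
reports, and other versions may word it slightly differently.

```python
def shadowing_demo():
    # Rebinding the name "list" hides the built-in inside this function.
    list = "spam and eggs".split(' ')
    try:
        list("abc")  # would normally build ['a', 'b', 'c']
    except TypeError as err:
        # On Python 3 the message says the 'list' object is not callable.
        return str(err)
    return None


print(shadowing_demo())
```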

Apart from that, so far so good. For each line, you generate a list of 
words. But that's when it goes wrong, because you don't do anything with 
the list of words! The next block of code is *outside* the for-loop, so 
it only runs once the for-loop is done. So it only sees the last list of 
words.
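
The effect is easy to reproduce with a few lines held in memory instead of 
a file. This is only a sketch: the function name count_outside_loop and 
the sample data are hypothetical, and the counting is shortened with 
dict.get so the example stays small.

```python
def count_outside_loop(lines):
    """Mimic the bug: the counting loop is not indented under the line loop."""
    word_counts = {}
    for line in lines:
        words = line.split()
    # Dedented: this runs once, after the loop above has finished,
    # so "words" holds the words of the last line only.
    for word in words:
        word_counts[word] = word_counts.get(word, 0) + 1
    return word_counts


sample = ["the cat", "the dog", "the end"]
print(count_outside_loop(sample))  # only "the" and "end" are counted
```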

> for word in list:

The problem here is that you lost the indentation. You need to indent the 
"for word in list" (better: "for word in words") so that it starts level 
with the line above it.

> 		if word in dict:
> 			count = dict[word]
> 			count += 1
> 			dict[word] = count

This bit is fine.

> else:
> 	dict[word] = 1

But this fails for the same reason! You have lost the indentation.

A little-known fact: Python for-loops take an "else" block too! It's a 
badly named clause, but sometimes useful: it runs only if the loop 
finishes without hitting "break". You can write:


for value in values:
    do_something_with(value)
    if condition:
        break  # skip to the end of the for...else
else:
    print "We never reached the break statement"
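
For a version you can actually run, here is a small sketch of both 
outcomes. The function first_negative and its inputs are invented for 
illustration; it is written so that it also runs on Python 3.

```python
def first_negative(values):
    """Return the first negative number in values, or None if there is none."""
    for value in values:
        if value < 0:
            result = value
            break  # leaving via break skips the else block
    else:
        # Runs only when the loop finished without hitting break.
        result = None
    return result


print(first_negative([3, -1, 4]))  # -1
print(first_negative([3, 1, 4]))   # None
```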

So by pure accident, you lined up the "else" statement with the for loop, 
instead of what you needed:

for line in tm:
    ... blah blah blah
    for word in words:
        if word in word_counts:  # better name than "dict"
            ... blah blah blah
        else:
            ...


> for word, count in dict.iteritems():
> 	print word + ":" + str(count)

And this bit is okay too.
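
Putting the pieces together, one possible corrected version looks like the 
sketch below. A few things here are my assumptions rather than your code: 
the function name count_words, the in-memory sample lines, the use of 
split() with no argument (so repeated spaces do not produce empty 
"words"), and the punctuation filter, which is spelled so that it runs on 
Python 3 as well as Python 2 (string.punctuation is the same set of 
characters you typed out by hand).

```python
import string


def count_words(lines):
    """Return a dict mapping each word to the number of times it occurs."""
    word_counts = {}
    for line in lines:
        line = line.strip().lower()
        # Drop punctuation characters one by one.
        line = ''.join(ch for ch in line if ch not in string.punctuation)
        words = line.split()
        for word in words:  # indented: runs for every line
            if word in word_counts:
                word_counts[word] += 1
            else:  # pairs with the "if", not with the "for"
                word_counts[word] = 1
    return word_counts


sample = ["The cat sat.", "The dog sat!", "A bird?"]
for word, count in sorted(count_words(sample).items()):
    print(word + ":" + str(count))
```

Since a file object yields its lines when you loop over it, you should be 
able to pass the open file straight in, as in count_words(open('ask.txt')).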


Good luck!


-- 
Steven


