counting how often the same word appears in a txt file...But my code only prints the last line entry in the txt file

dgcosgrave at gmail.com dgcosgrave at gmail.com
Wed Dec 19 06:34:06 EST 2012


On Thursday, December 20, 2012 12:03:21 AM UTC+13, Steven D'Aprano wrote:
> On Wed, 19 Dec 2012 02:45:13 -0800, dgcosgrave wrote:
> 
> 
> 
> > Hi, I am just starting out with Python. My code below turns the txt
> > file into a list of words, adds them to an empty dictionary, and prints
> > how often each word occurs, but it only seems to recognise and print the
> > last entry of the txt file. Any help would be great.
> 
> > 
> 
> > tm = open('ask.txt', 'r')
> > dict = {}
> > for line in tm:
> > 	line = line.strip()
> > 	line = line.translate(None, '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~')
> > 	line = line.lower()
> > 	list = line.split(' ')
> 
> 
> 
> Note: you should use descriptive names. Since this is a list of WORDS, a
> much better name would be "words" rather than list. Also, list is a
> built-in type, and you may run into trouble when you accidentally re-use
> that as a name. Same with using "dict" as you do.
> 
> 
> 
> Apart from that, so far so good. For each line, you generate a list of
> words. But that's when it goes wrong, because you don't do anything with
> the list of words! The next block of code is *outside* the for-loop, so
> it only runs once the for-loop is done. So it only sees the last list of
> words.
> 
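A minimal sketch of the effect described above, written for Python 3 (the code in this thread is Python 2). The sample lines are made up for illustration and stand in for the contents of ask.txt. Once the loop finishes, the name only refers to the list built from the final line:

```python
# Hypothetical sample input standing in for the lines of ask.txt.
lines = ["the cat sat", "on the mat", "the end"]

for line in lines:
    words = line.split(' ')   # rebound on every pass; earlier lists are discarded

# This statement is outside the loop, so it runs once, after the last pass.
print(words)  # ['the', 'end']
```

Anything that should happen for every line has to be indented inside the loop body.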
> 
> 
> > for word in list:
> 
> 
> 
> The problem here is that you lost the indentation. You need to indent the
> "for word in list" (better: "for word in words") so that it starts level
> with the line above it.
> 
> 
> 
> > 		if word in dict:
> > 			count = dict[word]
> > 			count += 1
> > 			dict[word] = count
> 
> 
> 
> This bit is fine.
> 
> 
> 
> > else:
> > 	dict[word] = 1
> 
> 
> 
> But this fails for the same reason! You have lost the indentation.
> 
> 
> 
> A little-known fact: Python for-loops take an "else" block too! It's a
> badly named statement, but sometimes useful. You can write:
> 
> 
> 
> 
> 
> for value in values:
>     do_something_with(value)
>     if condition:
>         break  # skip to the end of the for...else
> else:
>     print "We never reached the break statement"
> 
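For reference, a runnable sketch of for...else in Python 3 syntax (the example above uses the Python 2 print statement). The function name find_first_negative and the sample lists are made up for illustration:

```python
def find_first_negative(values):
    """Return the first negative number in values, or None if there is none."""
    for value in values:
        if value < 0:
            result = value
            break          # leaving via break skips the else block
    else:
        # Runs only when the loop finished without hitting break.
        result = None
    return result

print(find_first_negative([3, -1, 4]))  # -1
print(find_first_negative([3, 1, 4]))   # None
```

The else block also runs when the sequence is empty, since the loop then ends without ever reaching break.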
> 
> 
> So by pure accident, you lined up the "else" statement with the for loop,
> instead of what you needed:
> 
> 
> 
> for line in tm:
>     ... blah blah blah
>     for word in words:
>         if word in word_counts:  # better name than "dict"
>             ... blah blah blah
>         else:
>             ...
> 
> 
> 
> 
> 
> > for word, count in dict.iteritems():
> > 	print word + ":" + str(count)
> 
> 
> 
> And this bit is okay too.
> 
> 
> 
> 
> 
> Good luck!
> 
> 
> 
> 
> 
> -- 
> 
> Steven

Thanks Steven, I appreciate the great info for future coding. I have changed the names to be more descriptive and corrected the indentation... all works! Cheers
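
The final code was not posted in this thread, so the following is only a sketch of what the corrected structure might look like, not the exact script. It assumes Python 3 rather than the Python 2 used above: str.maketrans replaces translate(None, ...), and items() replaces iteritems(). The function name count_words and the sample text are made up for illustration. It also uses split() with no argument, which unlike split(' ') drops the empty strings that blank lines or repeated spaces would produce.

```python
import string

def count_words(lines):
    """Count how often each word appears across all the given lines."""
    # Translation table that deletes ASCII punctuation (Python 3 replacement
    # for the Python 2 call line.translate(None, '...')).
    strip_punctuation = str.maketrans('', '', string.punctuation)
    word_counts = {}
    for line in lines:
        words = line.strip().translate(strip_punctuation).lower().split()
        # This loop is indented inside the line loop, so it runs for every line.
        for word in words:
            if word in word_counts:
                word_counts[word] += 1
            else:
                word_counts[word] = 1
    return word_counts

if __name__ == '__main__':
    # Made-up sample text; an open file object can be passed the same way,
    # since iterating over a file yields its lines.
    sample = ["The cat sat.", "The cat ran!"]
    for word, count in count_words(sample).items():
        print(word + ":" + str(count))
```

With a real file, something like "with open('ask.txt') as tm: counts = count_words(tm)" should work, since the function only needs an iterable of strings.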


