[Tutor] Help/suggestion requested

nadeem nan nadeem_559 at yahoo.com
Sun Apr 10 21:11:54 EDT 2022


Hello experts!
My name is Nadeem and I am a novice learning Python through an online course. At the end of the first module, we have been assigned a small project that works on a piece of text or a paragraph of any size, punctuation included. The goals of the project are:
1. To write a script that takes the given text and adds each word from it to a dictionary along with its frequency (the number of times it occurs).
2. To remove punctuation before adding the words to the dictionary.
3. To remove common words such as 'the', 'they', 'are', 'not', 'be', 'me', 'it', 'is', 'in', etc. from the dictionary.
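
For reference, the three requirements above can be sketched in a few lines. This is only one possible approach and not the one used in my own code further down; the sample sentence, the short common_words set, and the use of collections.Counter and string.punctuation (ASCII punctuation only) are assumptions made for illustration.

```python
import string
from collections import Counter

# Hypothetical sample text and common-word set, chosen only for illustration.
sample = "The cat sat on the mat. The mat was not amused!"
common_words = {"the", "on", "was", "not"}

# Requirement 2: strip ASCII punctuation and convert to lower case.
cleaned = sample.translate(str.maketrans("", "", string.punctuation)).lower()

# Requirement 1: count how often each word occurs.
counts = Counter(cleaned.split())

# Requirement 3: keep only the words that are not in the common-word set.
filtered = {word: n for word, n in counts.items() if word not in common_words}

print(filtered)
```

A set is used for common_words here because membership tests on a set stay fast as the collection grows; a list would also work for a handful of words.
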
My problem:
I am unable to remove the common words from the dictionary. I have defined two dicts: word_dictionary, which stores each word and its frequency as a key:value pair, and final_dictionary, which should store the words and their frequencies without the common words. To do this, I defined a list, less_desired_words, containing the common words, and used nested loops to compare each word against the common words and, if there is no match, to add the word to final_dictionary. However, this is not working (I may be using the loops incorrectly or doing something else wrong).
Can you please advise how I can compare the words and add only those I wish to keep to final_dictionary? The full code is below.

#Initialise text as string
text = '''"I told you already," the curator stammered, kneeling defenseless on the floor of the gallery. "I
have no idea what you are talking about!"
"You are lying." The man stared at him, perfectly immobile except for the glint in his ghostly eyes.
"You and your brethren possess something that is not yours."
The curator felt a surge of adrenaline. How could he possibly know this?
"Tonight the rightful guardians will be restored. Tell me where it is hidden, and you will live." The
man leveled his gun at the curator's head. "Is it a secret you will die for?"
Saunière could not breathe.'''

#create a dictionary to store the words and their frequencies as key : value pair.
word_dictionary = {}
#create a dictionary to store the words without common words.
final_dictionary = {}
#split and store the sample text in a list
text_list = text.split()
print(text_list)
#define unwanted characters as string
unwanted_characters = '''.,/?@:;{}[]_ '"-+=!£$%^&*()~<>¬`'''

#define less desired or common words as a list
less_desired_words = ['the', 'a', 'they', 'are', 'i', 'me', 'you', 'we', 'there', 'their', 'can', 'our', 'is', 'not', 'for', 'in', 'on', 'no', 'have', 'he', 'she', 'and', 'your', 'him', 'her']

#iterate through text_list and remove the punctuation and convert to lower case words
for word in text_list:
    for character in unwanted_characters:
        word = word.replace(character, "")
        word = word.lower()

    #count the words in the list and add to dictionary with their frequency as key:value pair
    if word in word_dictionary:
        frequency = word_dictionary[word]
        word_dictionary[word] = frequency + 1
    else:
        word_dictionary[word] = 1

print(word_dictionary)

#remove the less desired or common words and add the remaining words to the final dictionary.
for word, frequent in word_dictionary.items():
    for notword in less_desired_words:
        if word != notword:
            final_dictionary[word] = frequent

#print(word_dictionary)
print(final_dictionary)
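
A likely reason the last loop keeps every word: inside the inner loop, word != notword is true for every common word except the one that matches, so even 'the' gets added when it is compared with 'a', 'they', and so on. Below is a minimal sketch of that behaviour and of one possible fix using a single "not in" membership test. The small dictionary and list are made-up stand-ins for word_dictionary and less_desired_words, and nested_result is a hypothetical name used only for the comparison.

```python
# Made-up stand-ins for word_dictionary and less_desired_words.
word_dictionary = {"the": 3, "curator": 2, "you": 4, "gallery": 1}
less_desired_words = ["the", "you", "are"]

# Nested-loop pattern: a word is added as soon as it differs from ANY
# common word, which always happens when the list has several entries.
nested_result = {}
for word, frequency in word_dictionary.items():
    for notword in less_desired_words:
        if word != notword:
            nested_result[word] = frequency

# Possible fix: one membership test per word instead of an inner loop.
final_dictionary = {}
for word, frequency in word_dictionary.items():
    if word not in less_desired_words:
        final_dictionary[word] = frequency

print(nested_result)     # nothing was filtered out
print(final_dictionary)  # only the words that are not in the list
```

The "not in" test asks the question once per word ("is this word anywhere in the list?"), which is what the inner loop was meant to express.
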

Thanking you in advance,
Nadeem  
                                        

