How to compare words from .txt file against words in .xlsx file via Python? I will then extract these words by writing it to a new .xls file

MRAB python at mrabarnett.plus.com
Sun Aug 4 19:21:36 EDT 2019


On 2019-08-05 00:10, A S wrote:
> Oh... By set did you mean by using python function set(variable) as 
> something?
>
> So sorry for bothering you..
>
Make it a set (outside the loop):

     dictionary = set()

and then add the words to it (inside the loop):

     dictionary.add(cell_range.value)

(Maybe also rename the variable to, say, "words_wanted", because calling 
it "dictionary" when it's not a dictionary (dict) could be confusing...)
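
As a minimal sketch of the above: the lists below are hypothetical stand-ins for the spreadsheet cell values and the words read from the .txt file (your real code would get them from the workbook and the file), and "words_wanted" is just the suggested new name.

```python
# Hypothetical stand-ins for the spreadsheet cell values and the text-file words.
cell_values = ["apple", "banana", "cherry", "apple"]
txt_words = ["banana", "durian", "apple"]

words_wanted = set()             # created once, outside the loop
for value in cell_values:
    words_wanted.add(value)      # collects every value; duplicates are ignored

# A membership test against the set matches whole words only.
matches = [word for word in txt_words if word in words_wanted]
print(matches)
```

Because the set is created before the loop, every value is kept instead of only the last one.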

> On Mon, 5 Aug 2019, 6:52 am A S, <aishan0403 at gmail.com> wrote:
>
>     Previously I had tried many methods and using set was one of them
>     but it didn't work out either.. I even tried to append it to a
>     list but it's not working out..
>
>     On Mon, 5 Aug 2019, 2:29 am MRAB, <python at mrabarnett.plus.com> wrote:
>
>         On 2019-08-04 18:53, A S wrote:
>         > Hi Mrab,
>         >
>         > Thank you so much for your detailed response, I really really
>         > appreciate it as I have been constantly trying to seek help
>         > regarding this issue.
>         >
>         > Yes, I figured that the dictionary is only capturing the last
>         > value :(
>         > I've been trying to get it to capture and store all the values
>         > to memory in python but it's not working..
>         >
>         > Are there any improvements that I could make to allow my code
>         > to work?
>         >
>         > I would be truly grateful if you could provide further insights
>         > on this..
>         >
>         > Thank you so much.
>         >
>         Make it a set and then add the words to it.
>
>         >
>         > On Mon, 5 Aug 2019, 1:45 am MRAB,
>         > <python at mrabarnett.plus.com> wrote:
>         >
>         >     On 2019-08-04 09:29, aishan0403 at gmail.com wrote:
>         >     > I want to compare the common words from multiple .txt
>         >     > files based on the words in multiple .xlsx files.
>         >     >
>         >     > Could anyone kindly help with my code? I have been stuck
>         >     > for weeks and really need help..
>         >     >
>         >     > Please refer to this link:
>         >     >
>         >     > https://stackoverflow.com/questions/57319707/how-to-compare-words-from-txt-file-against-words-in-xlsx-file-via-python-i-wi
>         >     >
>         >     > Any help is greatly appreciated really!!
>         >     >
>         >     First of all, in this line:
>         >
>         >          folder_path1 = os.chdir("C:/Users/xxx/Documents/xxxx/Test python dict")
>         >
>         >     it changes the current working directory (not a problem),
>         >     but 'chdir' returns None, so from that point 'folder_path1'
>         >     has the value None.
>         >
>         >     Then in this line:
>         >
>         >          for file in os.listdir(folder_path1):
>         >
>         >     it's actually doing:
>         >
>         >          for file in os.listdir(None):
>         >
>         >     which happens to work because passing it None means to
>         >     return the names in the current directory.
>         >
>         >     Now to your problem.
>         >
>         >     This line:
>         >
>         >          dictionary = cell_range.value
>         >
>         >     sets 'dictionary' to the value in the spreadsheet cell, and
>         >     you're doing it each time around the loop. At the end of the
>         >     loop, 'dictionary' will be set to the _last_ such value.
>         >     You're not collecting the value, but merely remembering the
>         >     last value.
>         >
>         >     Looking further on, there's this line:
>         >
>         >          if txtwords in dictionary:
>         >
>         >     Remember, 'dictionary' is the last value (a string), so
>         >     that'll be True only if 'txtwords' is a substring of the
>         >     string in 'dictionary'.
>         >
>         >     That's why you're seeing only one match.
>         >
>
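
For reference, the two points in the quoted message can be reproduced with a small sketch. The word list is a hypothetical stand-in for the spreadsheet cells, and the chdir call simply re-enters the current directory so that nothing on disk is changed.

```python
import os

# os.chdir changes the working directory and returns None.
# Re-entering the current directory keeps this example harmless.
folder_path1 = os.chdir(os.getcwd())
print(folder_path1)                      # None

# Passing None to os.listdir lists the current directory, same as ".".
same_listing = sorted(os.listdir(folder_path1)) == sorted(os.listdir("."))
print(same_listing)

# Plain assignment in a loop keeps only the last value (hypothetical data).
dictionary = None
for value in ["apple", "banana", "pineapple"]:
    dictionary = value                   # overwritten on every pass

# 'in' on a string is a substring test, not a word lookup.
apple_found = "apple" in dictionary      # True: substring of "pineapple"
banana_found = "banana" in dictionary    # False: "banana" was overwritten
print(apple_found, banana_found)
```

The last two results show why only one match appeared: the test runs against a single string and not against the collected words.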


