pattern
Sharan Basappa
sharan.basappa at gmail.com
Sat Jun 16 14:59:53 EDT 2018
Dear Cameron,
This is so kind of you. Thanks for spending time to explain the code.
It did help a lot. I went back and brushed up on lists & dictionaries.
At this point, I think I need to go back and brush up on Python from the start.
So, I will do that first.
On Friday, 15 June 2018 09:12:22 UTC+5:30, Cameron Simpson wrote:
> On 14Jun2018 20:01, Sharan Basappa <sharan.basappa at gmail.com> wrote:
> >> >Can anyone explain to me the purpose of "pattern" in the line below:
> >> >
> >> >documents.append((w, pattern['class']))
> >> >
> >> >documents is declared as a list as follows:
> >> >documents.append((w, pattern['class']))
> >>
> >> Not without a lot more context. Where did you find this code?
> >
> >I am sorry that partial info was not sufficient.
> >I am actually trying to implement my first text classification code and I am referring to the below URL for that:
> >
> >https://machinelearnings.co/text-classification-using-neural-networks-f5cd7b8765c6
>
> Ah, ok. It helps to include some cut/paste of the relevant code, though the URL
> is a big help.
>
> The wider context of the code you recite looks like this:
>
> words = []
> classes = []
> documents = []
> ignore_words = ['?']
> # loop through each sentence in our training data
> for pattern in training_data:
>     # tokenize each word in the sentence
>     w = nltk.word_tokenize(pattern['sentence'])
>     # add to our words list
>     words.extend(w)
>     # add to documents in our corpus
>     documents.append((w, pattern['class']))
>
> and the training_data is defined like this:
>
> training_data = []
> training_data.append({"class":"greeting", "sentence":"how are you?"})
> training_data.append({"class":"greeting", "sentence":"how is your day?"})
> ... lots more ...
>
> So training_data is a list of dicts, each dict holding a "class" and a
> "sentence" key. The "for pattern in training_data" loop iterates over each
> item of training_data. It calls nltk.word_tokenize on the "sentence" part of
> the training item, presumably getting a list of "word" strings. The documents
> list gets this tuple:
>
> (w, pattern['class'])
>
> added to it.
>
> In this way the documents list ends up with tuples of (words, classification),
> with the words coming from the sentence via nltk and the classification coming
> straight from the training item's "class" value.
>
> So at the end of the loop the documents list will look like:
>
> documents = [
> ( ['how', 'are', 'you'], 'greeting' ),
> ( ['how', 'is', 'your', 'day'], 'greeting' ),
> ]
>
> and so forth.
>
> Cheers,
> Cameron Simpson <cs at cskk.id.au>
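
For anyone following along, here is a minimal runnable sketch of the loop described above. It assumes nltk is not installed, so simple_tokenize is a hypothetical stand-in for nltk.word_tokenize (it is not part of nltk and may split text differently); the two training items are the ones quoted from the article. Note that this stand-in keeps punctuation such as '?' as its own token.

```python
import re


def simple_tokenize(sentence):
    # Hypothetical stand-in for nltk.word_tokenize (an assumption, not nltk's
    # actual algorithm): runs of word characters and single punctuation marks
    # each become one token.
    return re.findall(r"\w+|[^\w\s]", sentence)


# training_data is a list of dicts, each with a "class" and a "sentence" key
training_data = [
    {"class": "greeting", "sentence": "how are you?"},
    {"class": "greeting", "sentence": "how is your day?"},
]

words = []
documents = []

# "pattern" is just the loop variable: on each pass it is bound to one dict
for pattern in training_data:
    w = simple_tokenize(pattern["sentence"])   # list of token strings
    words.extend(w)                            # flat list of every token seen
    documents.append((w, pattern["class"]))    # one (tokens, class) tuple

for tokens, label in documents:
    print(label, tokens)
```

The name "pattern" has no special meaning to Python here; any other variable name in the for statement would behave the same way.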