Index of entity in List with a Condition

Cameron Simpson cs at cskk.id.au
Tue Jun 12 21:00:28 EDT 2018


On 11Jun2018 13:48, Subhabrata Banerjee <subhabangalore at gmail.com> wrote:
>I have the following sentence,
>
>"Donald Trump is the president of United States of America".
>
>I am trying to extract the index 'of', not only for single but also
>for its multi-occurrence (if they occur), from the list of words of the
>string, made by simply splitting the sentence.
> index1=[index for index, value in enumerate(words) if value == "of"],
>where words=sentence.split()
>
>I could do this part more or less nicely.
>
>But I am trying to say if the list of words has the words "United"
>and "States" and it has "of " in the sentence then the previous
>word before of is, president.
>
>I am confused how may I write this, if any one may help it.

You will probably have to drop the list comprehension and go with something 
more elaborate.
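
As a rough illustration, a plain loop over enumerate() gives room for the
extra tests. This is only a sketch: it assumes the condition means "an 'of'
immediately followed by 'United' and 'States'", which is one reading of the
question, and the name "found" is made up for the example.

```python
sentence = "Donald Trump is the president of United States of America"
words = sentence.split()

# Collect (index, previous word) for each "of" that is directly
# followed by "United" "States" (assumed reading of the question).
found = []
for index, word in enumerate(words):
    if word != "of":
        continue
    following = words[index + 1:index + 3]
    if following == ["United", "States"] and index > 0:
        found.append((index, words[index - 1]))

print(found)  # [(5, 'president')]
```

The second "of" (before "America") is skipped because the two words after it
are not "United" and "States".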

Also, lists have an "index" method:

  >>> L = [4,5,6]
  >>> L.index(5)
  1

though it doesn't solve your indexing problems on its own.
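
The "index" method also accepts an optional start position, so it can be
called repeatedly to find every occurrence. A minimal sketch, using the
sentence from the question (the variable names are illustrative only):

```python
words = "Donald Trump is the president of United States of America".split()

# Repeatedly search from just past the previous hit; list.index raises
# ValueError when there are no further matches.
positions = []
start = 0
while True:
    try:
        pos = words.index("of", start)
    except ValueError:
        break
    positions.append(pos)
    start = pos + 1

print(positions)  # [5, 8]
```

This gives the same positions as the enumerate() list comprehension, and like
it, says nothing about the neighbouring words.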

I would be inclined to deconstruct the sentence into a cross-linked list of 
elements. Consider making a simple class to encapsulate the knowledge about 
each word (totally untested):

  class Word:
    def __init__(self, word):
      self.word = word

  words = []
  for index, word in enumerate(sentence.split()):
    W = Word(word)
    W.index = index
    words.append(W)
    W.wordlist = words

Now you have a list of Word objects, each of which knows its list position 
_and_ also knows about the list itself, _and_ you have the list of Word objects 
corresponding to your sentence words.

You'll notice we can just hang whatever attributes we like off these "Word" 
objects: we added a .wordlist and .index on the fly. It isn't great formal 
object design, but it makes building things up very easy.

You can add methods or properties to your class, such as ".next":

  @property
  def next(self):
    # the Word after this one; raises IndexError on the last word
    return self.wordlist[self.index + 1]

and so forth. That will let you write expressions about Words:

  for W in words:
    if W.word == 'of' and W.next.word == 'United' and W.next.next.word == 'States' ...:
      if W.previous.word != 'president':
        ... oooh, unexpected preceding word! ...

You can see that you could also write methods like "is_preceded_by":

  def is_preceded_by(self, word2):
    return self.previous.word == word2

and test "W.is_preceded_by('president')".

In short, write out what you would like to express. Then write methods that 
implement the smaller parts of what you just wrote.
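
Putting the pieces together, a runnable sketch might look like the following.
It is one possible arrangement, not a definitive design: passing index and
wordlist to the constructor, returning None at the ends of the list, and the
helpers "is_followed_by" and "make_words" are all assumptions added for
illustration.

```python
class Word:
    """One word of a sentence, linked to the list that contains it."""

    def __init__(self, word, index, wordlist):
        self.word = word
        self.index = index
        self.wordlist = wordlist

    @property
    def next(self):
        # The following Word, or None for the last word (assumed behaviour).
        i = self.index + 1
        return self.wordlist[i] if i < len(self.wordlist) else None

    @property
    def previous(self):
        # The preceding Word, or None for the first word (assumed behaviour).
        return self.wordlist[self.index - 1] if self.index > 0 else None

    def is_preceded_by(self, word2):
        prev = self.previous
        return prev is not None and prev.word == word2

    def is_followed_by(self, *following):
        # Hypothetical helper: True if the next words match, in order.
        W = self
        for expected in following:
            W = W.next
            if W is None or W.word != expected:
                return False
        return True


def make_words(sentence):
    # Build the cross-linked list: every Word shares the same list object.
    words = []
    for index, word in enumerate(sentence.split()):
        words.append(Word(word, index, words))
    return words


sentence = "Donald Trump is the president of United States of America"
words = make_words(sentence)

# Every "of" followed by "United" "States", with the word before it.
matches = [
    (W.index, W.previous.word)
    for W in words
    if W.word == "of"
    and W.is_followed_by("United", "States")
    and W.previous is not None
]
print(matches)  # [(5, 'president')]
```

Returning None at the list ends keeps the small methods safe to call on the
first and last words; raising an exception there would be an equally
reasonable choice.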

Cheers,
Cameron Simpson <cs at cskk.id.au>


