need an idea, recognize sequence, fsm or genetic ?

Martin DeMello martindemello at yahoo.com
Thu Sep 2 05:42:26 EDT 2004


Joh <joh12005 at yahoo.fr> wrote:
> 
> also i think this is related to some kind of finite state automata,
> but i do not know how to build automatically massive parallel FSM (i
> can have many many sequences and huge amount of sentences)

here's a naive but clean way to do it

Write a finite state machine class that scans the sentence for a single
sequence. Then instantiate one of them per sequence, and run the
following loop (pseudocode)

for word, index in enumerate sentence
  for fsm in sequences
    fsm.advance(word)
    if fsm.state == END
      results[fsm] += (index - fsm.sequence_length, index)
      fsm.state == START

A possible optimisation is to combine all your sequences into a single
trie; however, this will also require backtracking so I'm not sure how
much faster it'd be in practice.

martin



More information about the Python-list mailing list