matching and extracting...

Andrew Dalke adalke at mindspring.com
Tue Jan 14 18:02:13 EST 2003


Shagshag wrote:
> Here is my problem : say i have items ix, some are "to discover" (must
> be extracted "-x"), some are "needed" (must be present "-p") and some
> are indifferent ("-?"). for example, i have sequence like :
> 
> s = "a0 a1 a2 a3 a4 a5 a6 a7 a8"
> 
> i would like to check if my sequence is matching sequence like :
> 
> m = "a0-p i1-x i2-x i3-? a4-p i5-x i6-? i7-x i8-?"
> 
> and get result like :
> 
> m is matching s, i1 is a1, i2 is a3, i5 is a5, i7 is a7 (in python a
> "true" and a dict)

This took a while for me to figure it out.  Couldn't "i8" be
matched to "a8"?

The approach I would suggest is to break them apart into words
(using the "split" method of strings) then do a recusive comparison,
as in

def recursive_match(s_words, m_words):
    if not s_words:
      ... make sure the remaining m words are "-?"
      ... if not, return None
      return ()
    if not m_words:
      ... then stop searching
      return ()
    if m_words[0] ends in "-p":
      make sure it matches s_words[0]
      x = recursive_match(s_words[1:], m_words[1:])
      if x is None:
        # there wasn't a submatch, so we don't match
        return None
      # return our match plus the submatch
      return (s_words[0], m_words[0]) + x
    if m_words[0] ends in "-x":
      x = recursive_match(s_words[1:], m_words[1:])
      if x is None:
        return None
      return (s_words[0], m_words[0]) + x
    if m_words[0] ends in "-?":
      for i in range(1, len(s_words)):
        x = recursive_match(s_words[i:], m_words[1:])
        if x is not None:
          return (s_words[i-1], m_words[0]) + x
      # no matches
      return None
   raise TypeError("unknown match term: %r" % (m_words[0],)

def do_match(s, m):
   m = recursive_match(s.split(), m.split()
   if m is None:
      .... no match
   else:
      m = m[:-1]  # the last term is an empty tuple
      .... the match is a list of tuples of the form (s_word, m_word)

You'll need to fix the Python yourself.

					Andrew
					dalke at dalkescientific.com








More information about the Python-list mailing list