LCS for word ? or array intersection ?
David Eppstein
eppstein at ics.uci.edu
Sun Jan 26 12:04:48 EST 2003
gregtehrig at yahoo.fr (Greg Tehrig) wrote in message
> > i've got a question for the guys from here, i think this might be done
> > by something like decorate/undecorate and LCS (Longest Common
> > Sequence) but perhaps you had a better idea than mine.
> >
> > for example :
> > s1 = """this is a line containing an example"""
> > s2 = """this is another sentence containing 12 words and it is another example"""
> >
> > and need following output:
> > ["""this is""", """containing""", """example"""]
> > ["""a line""", """an example"""]
Not sure whether this is exactly what you're looking for, but:
>>> import lcs
>>> s1 = """this is a line containing an example"""
>>> s2 = """this is another sentence containing 12 words and it is another example"""
>>> lcs.longestCommonSubsequence(s1.split(" "), s2.split(" "))
['this', 'is', 'containing', 'example']
This is with the lcs code I had in
http://www.ics.uci.edu/~eppstein/161/python/lcs.py
Undecorating to get the words of s1 not in the lcs (your second output)
looks straightforward enough...
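
For readers without lcs.py at hand, here is a minimal self-contained sketch of the same idea. It is not the code from the URL above: the function names tag_common_words and group_runs are made up for illustration, and the LCS is computed with the textbook dynamic-programming table. Each word of the first list is "decorated" with a flag saying whether it is part of the LCS, and adjacent words with the same flag are then joined into phrases. When several LCSs of the same length exist, this sketch simply picks one of them.

```python
from itertools import groupby


def tag_common_words(a, b):
    """Return [(word, in_lcs), ...] for every word of a (hypothetical helper).

    in_lcs is True when the word was chosen as part of one longest
    common subsequence of the word lists a and b.
    """
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of a[i:] and b[j:]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk forward through the table, decorating each word of a.
    tagged = []
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            tagged.append((a[i], True))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            tagged.append((a[i], False))
            i += 1
        else:
            j += 1
    # Any words of a left over cannot be in the LCS.
    tagged.extend((word, False) for word in a[i:])
    return tagged


def group_runs(tagged, keep):
    """Join adjacent words whose flag equals keep into phrases (undecorate)."""
    return [
        " ".join(word for word, _ in run)
        for flag, run in groupby(tagged, key=lambda pair: pair[1])
        if flag == keep
    ]


s1 = """this is a line containing an example"""
s2 = """this is another sentence containing 12 words and it is another example"""

tagged = tag_common_words(s1.split(" "), s2.split(" "))
common = group_runs(tagged, True)      # phrases of s1 inside the LCS
leftover = group_runs(tagged, False)   # phrases of s1 outside the LCS
print(common)    # ['this is', 'containing', 'example']
print(leftover)  # ['a line', 'an']
```

Note that the second list comes out as ['a line', 'an'] rather than the ['a line', 'an example'] asked for, because 'example' is already part of the common subsequence.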
--
David Eppstein UC Irvine Dept. of Information & Computer Science
eppstein at ics.uci.edu http://www.ics.uci.edu/~eppstein/