best way to align words?
Noah Rawlins
noah.rawlins at comcast.net
Thu Nov 30 21:25:57 EST 2006
Robert R. wrote:
> Hello,
>
> I would like to write a piece of code that helps me align several
> sequences of words and suggests their ordered common subwords
>
> s0 = "this is an example of a thing i would like to have".split()
> s1 = "another example of something else i would like to have".split()
> s2 = 'and this is another " example " but of something ; now i would still like to have'.split()
> ...
> alist = (s0, s1, s2)
>
> result should be : ('example', 'of', 'i', 'would', 'like', 'to', 'have')
>
> but I do not know how I should start; maybe you have a helpful
> suggestion?
> One problem I have is that with many different strings my results tend
> to be empty, while I would still like to get one of the best matches,
> or maybe all of them.
>
> best.
>
Your requirements are a little vague... how are these three strings handled?
s1 = "hello there dudes"
s2 = "dudes hello there"
s3 = "there dudes hello"
They all share the same 3 words, but in what order do you want them back?
Here is a simplistic approach using sets. It returns a list of the words
that occur in all of the strings, ordered (somewhat arbitrarily) by their
position in the first string. It does not worry about matches, or missed
matches, caused by punctuation, case and the like.
>>> from functools import reduce  # required on Python 3; reduce is a builtin on Python 2
>>> strList = []
>>> strList.append('this is an example of a thing i would like to have')
>>> strList.append('another example of something else i would like to have')
>>> strList.append('and this is another " example " but of something ; now i would still like to have')
>>> # words present in every string (computed once, not once per word)
>>> common = reduce(lambda x, y: x.intersection(y), [set(s.split()) for s in strList])
>>> [word for word in strList[0].split() if word in common]
['example', 'of', 'i', 'would', 'like', 'to', 'have']
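If punctuation and case should not get in the way of a match, the words can be normalized before they are compared. The following is only a minimal sketch under that assumption; `normalize` and `common_words` are hypothetical helper names, and stripping leading and trailing punctuation is just one possible rule.

```python
import string


def normalize(word):
    # Hypothetical rule: drop punctuation at both ends, then lowercase.
    return word.strip(string.punctuation).lower()


def common_words(sentences):
    # Split and normalize each sentence, discarding tokens that were
    # pure punctuation (they normalize to the empty string).
    word_lists = [[w for w in map(normalize, s.split()) if w]
                  for s in sentences]
    # Words that occur in every sentence.
    common = set(word_lists[0]).intersection(*word_lists[1:])
    # Report them in the order of the first sentence.
    return [w for w in word_lists[0] if w in common]
```

With this, `common_words(["Hello, there dudes!", "dudes HELLO there"])` treats `Hello,` and `HELLO` as the same word.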
But you still have issues with multiple matches and how they should be
handled, etc.
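If the words must come back in an order that every string agrees on, one option is a longest common subsequence (LCS) instead of a set intersection. The sketch below is an assumption about what is wanted, not a definitive answer: `lcs` and `align` are hypothetical names, and folding a pairwise LCS over the list is a heuristic that is not guaranteed to find the longest subsequence common to three or more strings.

```python
from functools import reduce


def lcs(a, b):
    """Return one longest common subsequence of two word lists."""
    # table[i][j] holds the LCS length of a[i:] and b[j:].
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) - 1, -1, -1):
        for j in range(len(b) - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the start to recover the words themselves.
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return result


def align(word_lists):
    # Heuristic: reduce the lists pairwise, left to right.
    return reduce(lcs, word_lists)
```

For the three example strings from the original question this yields the same seven words as the set approach. For the "hello there dudes" rotations it yields a single word, since no two-word sequence appears in the same order in all three strings.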
noah