Recursive generator for combinations of a multiset?
MRAB
python at mrabarnett.plus.com
Fri Nov 22 23:23:42 EST 2013
On 23/11/2013 00:58, John O'Hagan wrote:
> On Thu, 21 Nov 2013 12:59:26 -0800
> Dan Stromberg <drsalists at gmail.com> wrote:
>
>> On Wed, Nov 20, 2013 at 10:46 PM, John O'Hagan
>> <research at johnohagan.com>wrote:
>>
>> >
>> > Short story: the subject says it all, so if you have an answer
>> > already, fire away. Below is the long story of what I'm using it
>> > for, and why I think it needs to be recursive. It may even be of
>> > more general interest in terms of filtering the results of
>> > generators.
>> >
>>
>> I think you probably need permutations rather than combinations.
>>
>> Also, I think you'll need to form a word (partitioned off by spaces),
>> and then check it against a set containing /usr/share/dict/words
>> before recursing for the remainder of the sentence - this should
>> speed things up a LOT.
>
> Thanks for the reply. If I understand you correctly, you are suggesting
> permuting the input _characters_ to form words and then seeing if
> they exist, as opposed to my approach of combining known words and
> seeing if they are anagrams. (Permutations of words would not help find
> anagrams as they merely change the word order). Here is an attempt at
> that:
>
> def anagrams(partition, input_string):
> """Find anagrams which fit given partition of input string length"""
> if not partition:
> yield (), input_string
> return
> for words, checkstring in anagrams(partition[:-1], input_string):
> for word in itertools.permutations(checkstring, partition[-1]):
> word = ''.join(word)
> if word in WORDS: #WORDS is collection of dictionary words
> newstring = checkstring
> for l in word:
> newstring = newstring.replace(l, '' , 1)
> yield words + (word,), newstring
>
> There are two problems with this. If there are repeated characters in
> the input, redundant results are produced; a multiset-permutation
> algorithm would fix this. But the main problem is it is incredibly
> slow: on my run-of-the-mill laptop, it chokes on anything longer than
> about 10 characters, spending most of its time rejecting non-words.
>
> Or have I misunderstood your suggestion?
>
If you want to know how to get unique permutations, have a look here:
http://mail.python.org/pipermail/python-ideas/2013-October/023610.html
More information about the Python-list
mailing list