wildcard exclusion in cartesian products

Sun Mar 26 16:48:18 EST 2006

wkehowski at cox.net wrote:
> The python code below is adapted from a Haskell program written by
> Tomasz
> Wielonka on the comp.lang.functional group. It's more verbose than his
> since I wanted to make sure I got it right.
> 
> http://groups.google.com/group/comp.lang.functional/browse_frm/thread...
> 
> Does anyone know how to turn it into a module (or whatever it's called)
> so that I can put it in a
> loop and not have to generate the whole list? I've already tried
> without success.

I know next to nothing about Haskell, but I believe that it evaluates sequences 
lazily by default.  Python can certainly do lazy evaluation, but you'll have to 
ask for it explicitly.
> 
> The program solves the following problem: generate the subset X of the
> cartesian product S^n that excludes n-tuples defined by wildcards. For
> example, if I want to exclude from [1,2,3]^3 the wc's [1,"*",2] and
> ["*",3,"*",3,"*"], where "*" stands for a sequence of zero or more
> elements of [1,2,3], then I just type
> 

[I think your example and test case may have become mixed up]
...

Here's an iterative "lazy" solution.  While we're at it, why not use re for 
pattern matching, rather than rolling our own matcher (this does restrict the 
set members to characters, maybe that's OK):

import re

def cp_excl_wc(alpha, N, *exclusions):
     RE0, RE1 = compile_re(exclusions)
     def add_one(outer, RE_PATTS):
         if RE_PATTS:
             for partial in outer:
                 for a in alpha:
                     next = partial + a
                     if not RE_PATTS.match(next):
                         yield next
         else: # if there's no pattern to match at this length
               # don't waste time trying
             for partial in outer:
                 for a in alpha:
                     yield partial+a
     cpgen = "",
     # RE0 holds the patterns that match at the beginning, so are tested
     # against all iterations less than full length
     for n in range(N-1):
         cpgen = add_one(cpgen, RE0)
     # For the last place, test the full-length patterns
     cpgen = add_one(cpgen, RE1)
     return cpgen

def compile_re(exclusions):
     re0 = [] # patterns that can be matched anywhere
     re1 = [] # patterns that match only the full-length
     for pattern in exclusions:
         pattern = pattern.replace("*", ".*")
         if pattern.endswith(".*"):
             re0.append(pattern)
         re1.append(pattern+"$")

     re0 = re0 and re.compile("|".join(re0)) or None
     re1 = re1 and re.compile("|".join(re1)) or None
     return re0, re1

 >>> nm = cp_excl_wc("123",3,"1*2","3*1")
 >>> nm.next()
'111'
 >>> nm.next()
'113'
 >>> nm.next()
'121'
 >>> nm.next()
'123'
 >>> list(nm)
['131', '133', '211', '212', '213', '221', '222', '223', '231', '232', '233', 
'312', '313', '322', '323', '332', '333']
 >>>

Is this the sort of thing you had in mind?

Michael