Programming challenge: wildcard exclusion in cartesian products

Dirk Thierbach dthierbach at usenet.arcornews.de
Thu Mar 23 08:43:26 EST 2006


Dinko Tenev <dinko.tenev at gmail.com> wrote:
> Dirk Thierbach wrote:
>> If more time during preprocessing is allowed, another idea is to
>> treat the wildcard expressions as regular expressions, convert
>> each into a finite state machine, construct the "intersection" of
>> all these state machines, minimize it and then swap final and non-final
>> states.

> Given the requirements, did you mean taking the *union* and swapping
> states?  Or maybe swapping states first, and then taking the
> intersection?

Whatever the requirements were. Take your pick. :-)

>> Then you can use the resulting automaton to efficiently
>> enumerate S^n - W. In the above case, the resulting FSM would have just
>> three states.

> I don't see immediately how exactly this is going to work.  Unless I'm
> very much mistaken, a FSA in the classical sense will accept or reject
> only after the whole sequence has been consumed, and this spells
> exponential times.  

Exponential in respect to what? You just recursively walk the
automaton for every word of length n, so at most you'll have to check
every word in S^n once. Hence, the complexity is not worse than the
"generate and test" approach (it's in fact better, since testing is
trivial).

However, if the result set is simple (as in the example), then the
result FSM will have states that won't have transitions for every
letters, so I guess the average case will be a lot better.

> For improved asymptotic complexity in this case,
> you need to be able to at least reject in mid-sequence, 

One can do that if there is no valid transition out of some state.

I guess one could even optimize on that: In the minimal automaton,
every state has some transition sequence that will end up in a final
state. Annotate each state with the minimum number of steps for
such a sequence. While walking the automaton, keep the maximum
number of letters to produce in a variable. If this number is small
then the number in the annotation, stop and backtrace.

This should guarantee that only those states are visited that really
produce some output. 

- Dirk



More information about the Python-list mailing list