Programming challenge: wildcard exclusion in cartesian products

Dirk Thierbach dthierbach at usenet.arcornews.de
Wed Mar 22 09:14:00 EST 2006


[Had to drop alt.comp.lang.haskell, otherwise my newsserver doesn't accept it]

Dinko Tenev <dinko.tenev at gmail.com> wrote:
> OK, here's a case that will make your program run in exponential time:
> S = { a, b }, W = { *a*b, *b*a } -- on my machine, it starts getting
> ugly as soon as n is 15 or so.  Note that S^n - W = { a^n, b^n }.

> In general, whenever all the patterns in the set match against the last
> position, your current implementation is guaranteed to have to sift
> through all of S^n.  I'd say the very idea of checking against a
> blacklist is fundamentally flawed, as far as performance is concerned.

If more time during preprocessing is allowed, another idea is to
treat the wildcard expressions as regular expressions, convert
each into a finite state machine, construct the "intersection" of
all these state machines, minimize it and then swap final and non-final
states. Then you can use the resulting automaton to efficiently 
enumerate S^n - W. In the above case, the resulting FSM would have just 
three states.

And it doesn't really matter what language you use to implement this
algorithm, it's the idea that counts. Notation aside, all
implementations will be quite similar.

- Dirk






More information about the Python-list mailing list