The Joys Of Data-Driven Programming

Ben Bacarisse ben.usenet at bsb.me.uk
Sun Aug 21 17:02:32 EDT 2016


Tim Chase <python.list at tim.thechases.com> writes:

> On 2016-08-21 04:53, Rustom Mody wrote:
>> 2. Basic computing theory shows that re-s and dfas are equivalent.
>> Which would one prefer to write/debug? [Thats not a rhetorical
>> question]
>
> I'm curious where REs and DFAs are shown to be equivalent (serious,
> not trying to be snarky).  I can see with no trouble that all REs are
> DFAs, but can't convince myself that *all* DFAs are REs.

All NFAs as well -- the don't have to be deterministic in either
direction.

> Though this
> might also depend on the RE engine.

Yes.  Due to their use in real systems, lots of RE engines have acquired
many bells and whistles.  Some bells so fancy that the resulting "REs"
can recognise whole new classes of language.  In this context, though,
"regular expression" means the expression form of regular grammars
(level 3 in the Chomsky hierarchy).

> Do you have links to resources
> on such an "all REs have equivalent DFAs and all DFAs have equivalent
> REs" logic/argument?  Or maybe my understanding of DFAs is flawed.

This is called Kleene's Theorem.  Searching for that will bring up all
sorts of proofs in various styles.  Here's one of the direction you are
asking about:

http://www.cs.may.ie/staff/jpower/Courses/Previous/parsing/node6.html

The other direction is covered all over the place because it it so often
used inside an RE matching engine to "compile" the RE.

> And as for which I prefer to write, for simple text-processing that
> doesn't involve nesting, I generally prefer REs; but for more complex
> processing, a custom DFA tends to expand a bit more cleanly in my
> experience.

You may be confusing DFAs and pushdown automata.  I say this because of
your reference to nesting.  PDAs can be thought of as either DFAs or
NFAs with a stack so they can recognise languages with nested structure.

-- 
Ben.



More information about the Python-list mailing list