Has anyone implemented BASIC in Python?

Mon Aug 23 11:10:53 EDT 2004

In article <9h0ji0lvc0be65nuvcugaa3vsq5mcl3ck8 at 4ax.com>,
 Andrea Griffini <agriff at tin.it> wrote:

> Writing parsers, interpreters and compilers is a lot simpler than
> many do think. Of course the evil is in the details and getting
> something that comes even just close to (just for example ;-) )
> python is a *LOT* harder.
> 
> [...]
> 
> I never use yacc/lex, I just prefer hand-coding the parsers.
> 
> I never felt as pressing the problem of the speed of parsing
> (I've read that this is one main superiority of shift-reduce
> parsers). An hand-coded recursive descent parser is in my
> opinion very easy to write... and very easy to maintain.

For simple, single-purpose languages, I am inclined to agree that 
hand-written parsers are easy enough to write and maintain.  However, 
when you are dealing with a more complex general-purpose language, and 
one whose grammar needs to evolve over time, then I think a 
parser-generator is a much better solution by far.

The chief trouble in maintaining a hand-written recursive descent parser 
is that the grammar for the input language is hidden inside the 
implementation.  The author of the code can usually pick it out, but 
over time, even the author may find it difficult (and yes, I am speaking 
from a certain degree of first-hand experience in this matter).

In contrast, the grammars consumed by most parser-generators are 
explicitly written out as grammars, in the spirit (of not the form) of 
BNF notation.  If you want to make a change to the grammar (and thereby, 
the parser), you can easily see how the changes will affect the rest of 
the language, and a new parser can be created quickly and easily, simply 
by re-running the parser generator.  The only lines of code you need to 
touch are the places where your change affects the translation, and that 
is often quite minimal.

Furthermore, naive implementation of recursive descent is fraught with a 
few subtle perils to trip the novice and the unwary.  For instance, you 
must carefully avoid left-recursive productions, or your parser may not 
terminate.  Also, error-handling is tricky in recursive descent, because 
much of the parser's state is implicit in stack frames that must be 
correctly unwound when an error forces you to bail out.  If you're 
writing in a language (like Python) with good automatic memory 
management, the latter is less of an issue, but recursive descent 
parsers written in languages without automatic memory management, like C 
and C++, must contend mightily with this.

Of course, there is no silver bullet, but the availability of good LR(1) 
and LALR(1) parser generators should not be discounted, even if the 
theory behind them seems to be slightly complicated, on the surface.

Cheers,
-M

-- 
Michael J. Fromberger             | Lecturer, Dept. of Computer Science
http://www.dartmouth.edu/~sting/  | Dartmouth College, Hanover, NH, USA