Has anyone implemented BASIC in Python?

Wed Aug 25 00:37:59 EDT 2004

In article <285ni0hmusfgqslj9gnhhimvanr2jh0be3 at 4ax.com>,
 Andrea Griffini <agriff at tin.it> wrote:

> On Mon, 23 Aug 2004 11:10:53 -0400, "Michael J. Fromberger"
> <Michael.J.Fromberger at Clothing.Dartmouth.EDU> wrote:
> 
> >when you are dealing with a more complex general-purpose language, and 
> >one whose grammar needs to evolve over time, then I think a 
> >parser-generator is a much better solution by far.
> 
> Like I said before may be it's me... but I see the grammar definition 
> and the sparse collection of C code snippets required to use yacc & 
> friends harder to understand than a recursive descending parser.

Admittedly, YACC and Bison have a somewhat steep initial learning curve.  
However, in terms of long-term maintainability, that learning curve is 
well worth the trouble for non-trivial language projects.  By combining 
the context-free grammar with associated semantic information (the 
so-called "sparse collection of C code snippets" you mention above), and 
allowing the tool to generate your parser code, you are much more likely 
to get a parser you can actually argue is correct.

Correct code may not be fashionable, but it is certainly much more 
maintainable than incorrect code, or code whose correctness cannot 
easily be determined.  The usual response:  "It's worked fine so far!" 
does not cut much ice when you're writing something important, as I am 
sure you know!

> for me, shift-reduce parser are beyond the ROI point.

Ah, well.  Suit yourself.  :)  In my opinion, that's equivalent to a 
builder saying, "Power tools do not have good return-on-investment over 
doing all our cutting, fitting, and joining using hand tools."  But, 
perhaps if you only ever need to work with simple grammars, that might 
be true for your own needs.

> >The chief trouble in maintaining a hand-written recursive descent parser 
> >is that the grammar for the input language is hidden inside the 
> >implementation.
> 
> Normally the grammar is documented ALSO elsewhere in a very
> human readable form.

Of course -- but how do you know your human-readable documentation has 
any necessary correlation with the actions of the parser?  With a hand 
coded parser, you basically have nothing but the author's hopeful 
assurance.  Automatically generated parsers are generated directly from 
an EBNF analogue of the grammar (e.g., YACC input), which means you can 
be sure the parser matches the grammar, unless there's a bug in the 
parser generator, and the latter is S.E.P.

And, heck, if you don't like bottom-up shift-reduce parsers of the LR(k) 
family, there are even parser-generators that generate top-down 
predictive parsers of various stripes (e.g., ANTLR), that are reputed to 
work well for many common tasks.

> >Also, error-handling is tricky in recursive descent, because 
> >much of the parser's state is implicit in stack frames that must be 
> >correctly unwound when an error forces you to bail out.  
> 
> Here is an excerpt of a parser I wrote about a dozen years ago
> in C. [...]
> 
>       case -RW_IF :
> 	  SkipToken();
> 	  c=ParseExpr();
> 	  if (!ParseError && NextIs(-RW_THEN))
> 	    { t=ParseStat();
> 	      if (!ParseError)
> 		{ if (CurToken==-RW_ELSE)
> 		    { SkipToken(); e=ParseStat(); }
> 		  else
> 		    e=NULL;
> 		  r=NewIf(c,t,e);
> 		}
> 	      else
> 		{ Deref(t); Deref(c); }
> 	    }
> 	  else
> 	    Deref(c);
> 	  break;
> 
> This is the case for parsing IF-THEN-ELSE statement.

If you'll forgive me for being a bit harsh, I find this code practically 
unreadable.  You've got at least three layers of nested error-handling 
here, just to make sure your dereferences all happen in the correct 
order.  Having to explicitly check error codes like this, and manually 
handle memory, is PRECISELY the reason why I argued that error-handling 
is tricky in recursive descent.  If you handed me this code and asked me 
to find a problem in it, I'd probably tear all my hair out.

Does that mean you shouldn't use recursive descent?  Of course not!  The 
technique is clever, and has its place.  But this example could have 
been so much nicer, even just using a table-driven top-down predictive 
parser, rather than an explicitly recursive one.

> Of course this is just my personal view, based on the experience
> I had on the few languages I implemented. For example I never
> dared to write a parser for C++

Yeah.  Well, C++ is pretty awful to parse, no matter WHAT type of 
algorithm you want to use.  I won't dispute that at all. :)

> PS: Ok ok ok... I'll give shift-reduce parser and their tools
>     another look. May be it's just my laziness that is showing up and 
>     my brain that tries to find excuses for justifying it.

I think that, in the long run, you will probably find your efforts well 
rewarded.  I didn't intend to become the advocate for automatic parser 
generators, but I do think that someone just starting out should be 
aware that recursive descent is not always the best solution.

-M

-- 
Michael J. Fromberger             | Lecturer, Dept. of Computer Science
http://www.dartmouth.edu/~sting/  | Dartmouth College, Hanover, NH, USA