Refactoring in a large code base

Rustom Mody rustompmody at gmail.com
Fri Jan 22 08:30:48 EST 2016


On Friday, January 22, 2016 at 6:05:02 PM UTC+5:30, Chris Angelico wrote:
> On Fri, Jan 22, 2016 at 11:04 PM, Rustom Mody  wrote:
> > 2. My students trying to work inside the lexer made a mess because the extant lexer is a mess.
> > I.e. while python(3) *claims* to accept Unicode input, the actual lexer is
> > an ASCII lexer special-cased for unicode rather than pre-lexing utf8 to unicode
> >
> > These are just specific examples that I am familiar with
> 
> 
> Regarding lexers specifically, I have never seen any full-size
> language parser that I've wanted to tinker with. They're always highly
> optimized pieces of code, dealing with innumerable edge and corner
> cases, and exploring them is always like dipping my toe into something
> that's either ice-cold water or highly caustic acid, and I can't tell
> which.
> 

You just gave a graphic vivid description...
of the same thing Marko is describing: ;-) viz.
A full-size language parser is something that you - an experienced developer -
make a point of avoiding.
So then the question comes down to this: Is this the order of nature?
Or is it man-made disorder?
Jury's out on that one for lexers/parsers specifically.
For arbitrary code in general, the problem that it may be arbitrarily and unboundedly 
complex/complicated is the oldest problem in computer science: the halting problem.

IOW anyone who thinks that *arbitrary* complexity can *always* be tamed either
has a visa to utopia or needs to re-evaluate (or get) a CS degree



More information about the Python-list mailing list