What to use for finding as many syntax errors as possible.

Cameron Simpson cs at cskk.id.au
Mon Oct 10 18:17:13 EDT 2022


On 11Oct2022 08:02, Chris Angelico <rosuav at gmail.com> wrote:
>There's a huge difference between non-fatal errors and syntactic
>errors. The OP wants the parser to magically skip over a fundamental
>syntactic error and still parse everything else correctly. That's
>never going to work perfectly, and the OP is surprised at this.

The OP is not surprised by this, and explicitly expressed awareness that 
resuming a parse had potential for "misparsing" further code.

I remain of the opinion that one could resume a parse at the next 
unindented line and get reasonable results a lot of the time.
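
As a rough illustration of the idea above, here is a minimal sketch built on
the built-in compile(). It is not an existing tool: the function name
find_syntax_errors and the exact resume rule (skip to the next non-blank line
that starts in column 0) are assumptions made for this example. Lines that
have been skipped are replaced by empty lines so that reported line numbers
still match the original file.

```python
def find_syntax_errors(source, filename="<string>"):
    """Collect (lineno, message) pairs, resuming after each syntax error.

    Hypothetical helper: after an error, parsing restarts at the next
    unindented, non-blank line, which may itself be misparsed.
    """
    lines = source.splitlines()
    errors = []
    start = 0  # 0-based index of the first line still being parsed
    while start < len(lines):
        # Blank out the skipped lines so line numbers stay valid.
        text = "\n" * start + "\n".join(lines[start:]) + "\n"
        try:
            compile(text, filename, "exec")
            break  # the remainder parsed cleanly
        except SyntaxError as err:
            lineno = err.lineno or start + 1
            errors.append((lineno, err.msg))
        # lineno is 1-based, so it is also the 0-based index of the
        # line after the error; max() guarantees forward progress.
        resume = max(lineno, start + 1)
        while resume < len(lines) and (
            not lines[resume].strip() or lines[resume][0].isspace()
        ):
            resume += 1
        start = resume
    return errors


if __name__ == "__main__":
    sample = (
        "def f(x)\n"
        "    return x\n"
        "\n"
        "y = 1 +\n"
        "\n"
        "def g():\n"
        "    return 3 *\n"
        "\n"
        "z = 4\n"
    )
    for lineno, msg in find_syntax_errors(sample):
        print(f"line {lineno}: {msg}")
```

The sample contains three separate mistakes (lines 1, 4 and 7), and each is
reported in turn. The resume rule is deliberately naive: an unindented line
inside a triple-quoted string or an open bracket would fool it, which is
exactly the "misparsing" risk mentioned above.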

In fact, I expect that one could resume tokenising at almost any line 
which didn't seem to be inside a string and often get reasonable 
results.

I grew up with C and Pascal compilers which would _happily_ produce many 
complaints, usually accurate, about all manner of syntactic errors. They 
didn't stop at the first syntax error.

All you need in principle is a parser which goes "report syntax error 
here, continue assuming <some state>". For Python that might mean 
"pretend the missing final colon is there" or "close open brackets" etc, 
depending on the context. If you make conservative implied corrections 
you can get a reasonable continued parse, enough to find further syntax 
errors.
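
The "report, then continue assuming some state" approach can be sketched in a
few lines, again using only compile(). Everything named here
(check_with_repairs, BLOCK_OPENER, the max_errors limit) is hypothetical, and
the two repairs are simply guesses at what "conservative" might mean: supply
a missing trailing colon on a block-opening line, otherwise replace the
offending line with "pass" at the same indentation.

```python
import re

# Keywords that open a block and therefore need a trailing colon.
BLOCK_OPENER = re.compile(
    r"^\s*(def|class|if|elif|else|for|while|try|except|finally|with)\b"
)


def check_with_repairs(source, filename="<string>", max_errors=20):
    """Report each syntax error, patch the line, and parse again.

    Hypothetical helper: the repairs are crude guesses, so later
    reports may be knock-on effects of an earlier patch.
    """
    lines = source.splitlines()
    errors = []
    while len(errors) < max_errors:
        try:
            compile("\n".join(lines) + "\n", filename, "exec")
            break  # everything parses now
        except SyntaxError as err:
            errors.append((err.lineno, err.msg))
            lineno = err.lineno
        if lineno is None or not 1 <= lineno <= len(lines):
            break  # nothing sensible to patch (e.g. error at EOF)
        line = lines[lineno - 1]
        if line.strip() == "pass":
            break  # already patched once; give up rather than loop
        if BLOCK_OPENER.match(line) and not line.rstrip().endswith(":"):
            # Assume the colon was simply forgotten.
            lines[lineno - 1] = line.rstrip() + ":"
        else:
            # Neutralise the line but keep its indentation.
            indent = line[: len(line) - len(line.lstrip())]
            lines[lineno - 1] = indent + "pass"
    return errors
```

A real implementation would make these corrections inside the parser, where
the grammar state is known, rather than by patching text and reparsing. The
sketch only shows that a small number of implied corrections is enough to get
past one error and reach the next.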

I remember the Pascal compiler in particular had a really good "you 
missed a semicolon _back there_" mode which was almost always correct, a 
nice boon when correcting mistakes.

Cheers,
Cameron Simpson <cs at cskk.id.au>

