What to use for finding as many syntax errors as possible.

Peter J. Holzer hjp-python at hjp.at
Wed Oct 12 20:14:23 EDT 2022


On 2022-10-11 09:47:52 +1100, Chris Angelico wrote:
> On Tue, 11 Oct 2022 at 09:18, Cameron Simpson <cs at cskk.id.au> wrote:
> >
> Consider:
> 
> if condition # no colon
>     code
> else:
>     code
> 
> To actually "restart" parsing, you have to make a guess of some sort.

Right. At least one of the papers on parsing I read over the last few
years (yeah, I really should try to find them again) argued that the
vast majority of syntax errors are either a missing token, a superfluous
token or a combination of the two. So one strategy with good results
is to heuristically try to insert or delete single tokens and check
which results in the longest distance to the next error.
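
A minimal sketch of that insert-or-delete strategy, on a toy expression
grammar rather than Python's. Everything here (VOCAB, first_error,
best_repair, the grammar itself) is made up for illustration; it is not
taken from any of those papers or from CPython, and a real implementation
would need a far larger token set and a smarter cost model.

```python
from typing import List, Optional, Tuple

# Hypothetical toy token set used as insertion candidates.
VOCAB = ["1", "+", "(", ")"]


class ParseError(Exception):
    def __init__(self, pos: int) -> None:
        super().__init__(pos)
        self.pos = pos


def parse_term(toks: List[str], i: int) -> int:
    # term := NUMBER | "(" expr ")"
    if i >= len(toks):
        raise ParseError(i)  # unexpected end of input
    if toks[i].isdigit():
        return i + 1
    if toks[i] == "(":
        i = parse_expr(toks, i + 1)
        if i >= len(toks) or toks[i] != ")":
            raise ParseError(i)
        return i + 1
    raise ParseError(i)


def parse_expr(toks: List[str], i: int) -> int:
    # expr := term (("+" | "-") term)*
    i = parse_term(toks, i)
    while i < len(toks) and toks[i] in ("+", "-"):
        i = parse_term(toks, i + 1)
    return i


def first_error(toks: List[str]) -> Optional[int]:
    """Index of the first token the parser rejects, or None if valid."""
    try:
        i = parse_expr(toks, 0)
    except ParseError as e:
        return e.pos
    return None if i == len(toks) else i


def best_repair(toks: List[str]) -> Optional[Tuple[str, List[str]]]:
    """Try deleting the offending token or inserting one token before it;
    keep the candidate whose next error is furthest away."""
    err = first_error(toks)
    if err is None:
        return None
    candidates = []
    if err < len(toks):
        candidates.append(("delete " + toks[err], toks[:err] + toks[err + 1:]))
    for t in VOCAB:
        candidates.append(("insert " + t, toks[:err] + [t] + toks[err:]))

    def progress(cand: Tuple[str, List[str]]) -> float:
        e = first_error(cand[1])
        if e is None:
            return float("inf")  # no further error at all
        # Map the new error position back to an index in the original
        # input, so insertions and deletions are compared fairly.
        return e - (len(cand[1]) - len(toks))

    return max(candidates, key=progress)


print(best_repair("( 1 + 1 + 1".split()))
print(best_repair("1 ) + 1".split()))
```

Note that ties are common: for "1 + + 1", deleting the second "+" and
inserting a "1" before it both give a valid parse, so the tool still has
to guess, and this sketch simply takes the first candidate it tried.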

Checking multiple possible fixes has its cost, especially since you have
to do that at every error. So you can argue that it is better for
productivity if you discover one error in 0.1 seconds than 10 errors in
5 seconds.


> > I grew up with C and Pascal compilers which would _happily_ produce many
> > complaints, usually accurate, and all manner of syntactic errors. They
> > didn't stop at the first syntax error.
> 
> Yes, because they work with a much simpler grammar.

I very much doubt that. Python doesn't have a particularly complicated
grammar, and C certainly doesn't have a particularly simple one.

The argument that it's impossible in Python (unlike in any other language)
because Python is oh so special doesn't hold water.

        hp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp at hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"