What to use for finding as many syntax errors as possible.

Thomas Passin list1 at tompassin.net
Tue Oct 11 17:09:27 EDT 2022


On 10/11/2022 4:00 PM, Chris Angelico wrote:
> On Wed, 12 Oct 2022 at 05:23, Thomas Passin <list1 at tompassin.net> wrote:
>>
>> On 10/11/2022 3:10 AM, avi.e.gross at gmail.com wrote:
>>> I see resemblances to something like how a web page is loaded and operated.
>>> I mean very different but at some level not so much.
>>>
>>> I mean a typical web page is read in as HTML with various keyword regions
>>> expected such as <BODY> ... </BODY> or <DIV ...> ... </DIV> with things
>>> often cleanly nested in others. The browser makes nodes galore in some kind
>>> of tree format with an assortment of objects whose attributes or methods
>>> represent aspects of what it sees. The resulting treelike structure has
>>> names like DOM.
>>
>> To bring things back to the context of the original post, actual web
>> browsers are extremely tolerant of HTML syntax errors (including
>> incorrect nesting of tags) in the documents they receive.  They usually
>> recover silently from errors and are able to display the rest of the
>> page.  Usually they manage this correctly.
> 
> Having had to debug tiny errors in HTML pages that resulted in
> extremely weird behaviour, I'm not sure that I agree that they usually
> manage correctly. Fundamentally, they guess, and guesswork is never
> reliable.

Still, browsers generally do a very decent job of recovery, even though
perfection isn't possible.  The OP wants help finding problems in his
files even if that help isn't perfect, and I think that's a reasonable
thing to wish for.  The post about the lezer parser that was linked in a
recent message on this thread is partly about how a real, practical
parser can do some error correction in mid-flight, for the purposes of a
programming editor (as opposed to a parser that has to build a correct
program).
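
To make the "keep going after an error" idea concrete, here is a minimal
sketch in Python.  It is not how lezer or any browser does recovery; it
just asks the built-in compile() for the first syntax error, swaps the
offending line for "pass", and tries again.  The function name
find_syntax_errors and the max_errors cap are made up for illustration,
and like any such recovery it is guesswork, so later reports can be
spurious.

```python
def find_syntax_errors(source, max_errors=20):
    """Collect (line number, message) pairs for several syntax errors.

    Crude recovery strategy (an assumption for illustration only):
    after each SyntaxError, replace the reported line with `pass`
    at the same indentation and compile again.
    """
    lines = source.splitlines()
    patched = set()   # line numbers already replaced with `pass`
    errors = []

    for _ in range(max_errors):
        try:
            compile("\n".join(lines) + "\n", "<source>", "exec")
        except SyntaxError as exc:
            errors.append((exc.lineno, exc.msg))
            lineno = exc.lineno
            # Give up if the error has no usable line number, or if
            # patching this line did not help the previous time.
            if lineno is None or not 1 <= lineno <= len(lines):
                break
            if lineno in patched:
                break
            text = lines[lineno - 1]
            indent = text[:len(text) - len(text.lstrip())]
            lines[lineno - 1] = indent + "pass"
            patched.add(lineno)
        else:
            # The (patched) source compiles; nothing more to report.
            break

    return errors


if __name__ == "__main__":
    sample = "a = 1\nb = = 2\nc = 3\nd = = 4\n"
    for lineno, msg in find_syntax_errors(sample):
        print(f"line {lineno}: {msg}")
```

The exact messages depend on the Python version, and errors that span
lines (an unclosed bracket, a bad indent) will confuse this approach
quickly, which is the sort of case where a parser designed for recovery
earns its keep.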


