HTML extraction

Dieter Maurer dieter at handshake.de
Thu Dec 9 12:04:15 EST 2021


Pieter van Oostrum wrote at 2021-12-8 11:00 +0100:
> ...
>bs4 can do it, but lxml wants correct XML.

Use `lxml`'s `HTMLParser` to parse HTML
(--> see "https://lxml.de/parsing.html#parsing-html").
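
A minimal sketch of what that might look like, assuming `lxml` is
installed; the sample markup and the variable names are made up for
illustration and are not from the original thread:

```python
from lxml import etree

# Hypothetical sample input: sloppy HTML with unclosed <p> and <br> tags.
broken_html = "<html><body><p>First<br>Second<p>Third</body></html>"

# The default (XML) parser rejects this input because it is not well-formed.
try:
    etree.fromstring(broken_html)
    xml_parse_failed = False
except etree.XMLSyntaxError:
    xml_parse_failed = True

# The HTML parser recovers from the missing end tags and builds a tree anyway.
parser = etree.HTMLParser()
root = etree.fromstring(broken_html, parser)

# Collect the text content of each <p> element in document order.
paragraphs = ["".join(p.itertext()) for p in root.iter("p")]
print(paragraphs)
```

For a quick one-off, `lxml.html.fromstring` is another entry point that
uses the HTML parser; which one fits better depends on the extraction task.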

