Browsers

Andrew M. Kuchling akuchlin at mems-exchange.org
Tue Jun 1 16:13:12 EDT 1999


>Daniel Faulkner <m01ymu00 at cwcom.net> wrote:
>> Is there a basic browser some where that I can look at to see how it
>> works? (not grail)
>> As I can't understand much of the python internet software and don't
>> understand how to parse the HTML once I've got it.

G. David Kuhlman writes:
>Lynx is a text mode browser:
>    http://lynx.browser.org/
>For fancier stuff, look at Mozilla:
>      http://www.mozilla.org/

       Note, however, that an HTML parser capable of coping with all
the invalid HTML on the Web is a complicated beast.  For example, Lynx
currently has an SGMLish style parser that has been brain damaged in
various ways to cope with invalid HTML.  I don't know how much error
correction Grail includes, but it might actually be a simpler parser
if it hasn't been complicated with various error recovery hacks.
Another good option might be to look at the test code in htmllib.py,
which does simple HTML-to-text formatting.  (When trying to figure out
a module, always look in the module's code first, since authors will
often include simple examples or test scripts inside an
'if __name__ == "__main__":' block.)
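
To give a feel for what simple HTML-to-text formatting involves, here
is a minimal sketch.  It is not the htmllib.py test code: htmllib was
removed in Python 3, so this assumes the standard html.parser module
instead.  The names TextExtractor, html_to_text and BLOCK_TAGS are made
up for illustration, and the set of tags treated as line breaks is an
arbitrary choice.  It makes no attempt at real error recovery; it just
ignores markup it doesn't care about.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text of an HTML document and discard the markup."""

    # Tags that start or end a line of output (an illustrative choice).
    BLOCK_TAGS = {"p", "br", "div", "li", "h1", "h2", "h3", "tr"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_data(self, data):
        # Text between tags arrives here; drop it inside script/style.
        if not self.skip_depth:
            self.chunks.append(data)

    def text(self):
        # Collapse runs of whitespace and drop empty lines.
        raw_lines = "".join(self.chunks).splitlines()
        lines = (" ".join(line.split()) for line in raw_lines)
        return "\n".join(line for line in lines if line)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()


if __name__ == "__main__":
    sample = ("<html><body><h1>Title</h1><p>Some <b>bold</b> text."
              "<p>Unclosed paragraph</body></html>")
    print(html_to_text(sample))
```

Note that the second <p> in the sample is never closed.  The sketch
copes only because it never checks that tags are balanced; a browser
that has to build a document tree from such input needs real recovery
rules, which is where the complication comes from.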

-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
Time, place, and action may with pains be wrought, / But Genius must be born;
and can never be taught.
    -- John Congreve




