printing indented html code

Lowell Kirsh lkirsh at cs.ubc.ca
Fri Jun 24 07:18:12 EDT 2005


Looks good. I'll give it a try.

Konstantin Veretennicov wrote:
> On 6/24/05, Lowell Kirsh <lkirsh at cs.ubc.ca> wrote:
> 
>>Is there a module or library anyone knows of that will print html code
>>indented?
> 
> 
> Depends on whether you have to deal with xhtml, valid html or just
> any, possibly invalid, "pseudo-html" that abounds on the net.
> 
> Here's an example of (too) simple html processing using the standard
> HTMLParser module:
> 
> from HTMLParser import HTMLParser
> 
> class Indenter(HTMLParser):
>     
>     def __init__(self, out):
>         HTMLParser.__init__(self)
>         self.out = out
>         self.indent_level = 0
>     
>     def write_line(self, text):
>         print >> self.out, '\t' * self.indent_level + text
>     
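>     def handle_starttag(self, tag, attrs):
>         # Quote attribute values; a valueless attribute arrives as (name, None).
>         # Building the parts list avoids a trailing space when there are no attrs.
>         parts = [tag]
>         for k, v in attrs:
>             if v is None:
>                 parts.append(k)
>             else:
>                 parts.append('%s="%s"' % (k, v))
>         self.write_line('<%s>' % ' '.join(parts))
>         self.indent_level += 1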
>     
>     def handle_endtag(self, tag):
>         self.indent_level -= 1
>         self.write_line('</%s>' % tag)
>     
>     handle_data = write_line
>     
>     # handle everything else...
>     # http://docs.python.org/lib/module-HTMLParser.html
> 
> if __name__ == '__main__':
>     import sys
>     i = Indenter(sys.stdout)
>     i.feed('<html><head>foobar</head><body color=0>body</body></html>')
>     i.close()
> 
> - kv
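
For anyone trying this on Python 3, here is a rough sketch of the same
idea, not a definitive implementation. It assumes the html.parser module
(the Python 3 home of HTMLParser). The VOID_TAGS set and the indent_html
helper are names made up for this illustration and are not part of the
code above. It also skips whitespace-only text and leaves comments,
doctypes and <pre> content unhandled, so it is still "(too) simple".

```python
import html
import io
from html.parser import HTMLParser

# Assumption: elements that never take an end tag, so they must not
# deepen the indent. This list is illustrative, not exhaustive.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class Indenter(HTMLParser):

    def __init__(self, out):
        super().__init__()
        self.out = out
        self.indent_level = 0

    def write_line(self, text):
        self.out.write("\t" * self.indent_level + text + "\n")

    def handle_starttag(self, tag, attrs):
        parts = [tag]
        for name, value in attrs:
            if value is None:
                # Valueless attribute such as <input disabled>.
                parts.append(name)
            else:
                # The parser hands over unescaped values, so re-escape them.
                parts.append('%s="%s"' % (name, html.escape(value, quote=True)))
        self.write_line("<%s>" % " ".join(parts))
        if tag not in VOID_TAGS:
            self.indent_level += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        # Never go below zero on stray end tags in invalid markup.
        self.indent_level = max(0, self.indent_level - 1)
        self.write_line("</%s>" % tag)

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.write_line(text)


def indent_html(source):
    """Hypothetical helper: return the indented form of source as a string."""
    buf = io.StringIO()
    parser = Indenter(buf)
    parser.feed(source)
    parser.close()
    return buf.getvalue()


if __name__ == "__main__":
    sample = "<html><head>foobar</head><body color=0>body<br></body></html>"
    print(indent_html(sample), end="")
```

Writing to a file-like object, as in the original, keeps the class usable
with sys.stdout as well as with an in-memory buffer.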


