Parsing html

Leif K-Brooks eurleif at ecritters.biz
Thu Jul 8 15:37:05 EDT 2004


C Gillespie wrote:
> I have hopefully a very simple problem. I wish to parse an html page and
> extract everything between the <body> tags.

People are actually suggesting using DOM for this?! For a one-off
extraction like this, plain string searching is much simpler. This
returns the body element, tags included:

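def get_body(html):
	# Search for '<body' rather than '<body>' so attributes are allowed.
	body_start = html.find('<body')
	body_end = html.find('</body>', body_start)
	if body_start == -1 or body_end == -1:
		return None  # no complete body element found
	# Add len('</body>') so the closing tag is included in the slice.
	return html[body_start:body_end + len('</body>')]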
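
If you want only what sits between the tags, and your pages might spell
the tag as <BODY> or <Body>, something like the following might do. It is
only a sketch: get_body_contents and _BODY_RE are names made up for this
example, and it assumes there is a single body element and that no
attribute value in the opening tag contains a '>' character.

```python
import re

# Hypothetical helper, not part of the original post.
# <body\b[^>]*>  matches the opening tag, with or without attributes
# (.*?)          lazily captures everything up to the first closing tag
# </body\s*>     matches the closing tag
_BODY_RE = re.compile(r'<body\b[^>]*>(.*?)</body\s*>',
                      re.IGNORECASE | re.DOTALL)

def get_body_contents(html):
    """Return the text between the body tags, or None if there is no body."""
    match = _BODY_RE.search(html)
    if match is None:
        return None
    return match.group(1)

page = '<html><BODY bgcolor="white"><p>Hi</p></BODY></html>'
print(get_body_contents(page))  # prints <p>Hi</p>
```

re.DOTALL lets '.' match newlines, so a body spanning many lines is still
captured in one piece. For badly broken markup a real parser is probably
still the safer choice.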


