Parsing html

Leif K-Brooks eurleif at ecritters.biz
Thu Jul 8 15:37:05 EDT 2004


C Gillespie wrote:
> I have hopefully a very simple problem. I wish to parse an html page and
> extract everything between the <body> tags.

People are actually suggesting using DOM for this?! A simple approach is 
much better:

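def get_body(html):
	# Return the <body>...</body> element, tags included; '' if a tag is missing.
	body_start = html.find('<body')
	body_end = html.find('</body>', body_start)
	if body_start == -1 or body_end == -1:
		return ''
	return html[body_start:body_end + len('</body>')]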
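
The function above returns the <body> tags along with what they enclose, and it only matches lowercase tags. If only the text between the tags is wanted, a variant along these lines might do. This is a rough sketch, not a real parser: the name get_body_contents and the regular expression are illustrative assumptions, and it will be fooled by a '>' inside a quoted attribute value or by '</body>' appearing inside a script or comment.

```python
import re

# Hypothetical helper, not from the original post.
# Matches <body ...> case-insensitively and captures everything up to the
# first </body>; DOTALL lets '.' span newlines.
_BODY_RE = re.compile(r'<body\b[^>]*>(.*?)</body\s*>', re.IGNORECASE | re.DOTALL)

def get_body_contents(html):
    """Return the text between the body tags, or '' if no body is found."""
    match = _BODY_RE.search(html)
    return match.group(1) if match else ''
```

For example, get_body_contents('<BODY class="x"><p>Hi</p></BODY>') gives back just '<p>Hi</p>', and a page with no body element gives back an empty string.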


More information about the Python-list mailing list