HTMLParser.HTMLParseError: EOF in middle of construct

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Tue Jun 19 00:37:35 EDT 2007


On Mon, 18 Jun 2007 16:38:18 -0300, Sergio Monteiro Basto
<sergio at sergiomb.no-ip.org> wrote:

> Can someone explain to me what is wrong with this site?
>
> python linkExtractor3.py http://www.noticiasdeaveiro.pt > test
>
> HTMLParser.HTMLParseError: EOF in middle of construct, at line 1173,
> column 1
>
> Line 1173 of the test file looks perfectly normal.

That page is not valid HTML: http://validator.w3.org/ finds 726 errors in
it.
HTMLParser expects valid HTML, so try a different tool, like BeautifulSoup,
which is specifically designed to handle malformed pages.
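
As a rough illustration of that suggestion, here is a minimal sketch of a
link extractor that keeps working on broken markup. It is not the original
linkExtractor3.py, which was not posted; the names extract_links and
_LinkCollector are made up for this example. It assumes Python 3 and the
current BeautifulSoup package (imported as bs4), and falls back to the
Python 3 standard-library parser, which tolerates malformed input, when
bs4 is not installed.

```python
from html.parser import HTMLParser

try:
    # Third-party package, assumed installed via: pip install beautifulsoup4
    from bs4 import BeautifulSoup
except ImportError:
    BeautifulSoup = None


class _LinkCollector(HTMLParser):
    """Fallback collector built on the Python 3 stdlib parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the start tag.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    """Return the href of every <a> tag, in document order."""
    if BeautifulSoup is not None:
        # "html.parser" selects the builder that needs no extra dependency.
        soup = BeautifulSoup(html, "html.parser")
        return [a["href"] for a in soup.find_all("a", href=True)]
    collector = _LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


if __name__ == "__main__":
    # Deliberately malformed: unclosed <a>, stray <p>, unclosed <b>.
    broken = '<html><body><a href="/one">one<p><a href="/two">two</a><b>oops'
    for link in extract_links(broken):
        print(link)
```

Either branch should collect both links from the malformed sample instead
of stopping with a parse error.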

-- 
Gabriel Genellina



