HTMLParser.HTMLParseError: EOF in middle of construct

Marc 'BlackJack' Rintsch bj_666 at gmx.net
Tue Jun 19 11:12:49 EDT 2007


In <4677e790$0$2840$a729d347 at news.telepac.pt>, none wrote:

> Gabriel Genellina wrote:
>> On Mon, 18 Jun 2007 16:38:18 -0300, Sergio Monteiro Basto 
>> <sergio at sergiomb.no-ip.org> wrote:
>> 
>>> Can someone explain to me what is wrong with this site?
>>>
>>> python linkExtractor3.py http://www.noticiasdeaveiro.pt > test
>>>
>>> HTMLParser.HTMLParseError: EOF in middle of construct, at line 1173,
>>> column 1
>>>
>>> Line 1173 of the test file looks perfectly normal.
>> 
>> That page is not valid HTML - http://validator.w3.org/ finds 726 errors 
>> in it.
> 
> OK, but my problem is that I don't understand what the specific problem
> at line 1173 is.

You can't just look at that line and ignore the rest.  There are 604 (!)
errors, some about table rows, before this line.  So the parser may
already be confused by the time it gets there, and be in an internal
state that sees that line in a completely different light than you do.
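
To illustrate how an earlier mistake surfaces at a later, innocent-looking
line, here is a minimal sketch.  It assumes Python 3's `html.parser` rather
than the old `HTMLParser` module from this thread, and it does not reproduce
the `HTMLParseError` itself.  The names `NestingChecker`, `check` and
`VOID_TAGS` are made up for this example.  It only tracks which tags are
open and notes where a closing tag does not match.

```python
from html.parser import HTMLParser

# Tags that never take a closing tag (partial list, enough for the sketch).
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class NestingChecker(HTMLParser):
    """Hypothetical helper: remember open tags and note mismatched closes."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of (tag, line, column) still open
        self.problems = []   # (line, column, closing tag, tag on top of stack)

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            line, col = self.getpos()
            self.open_tags.append((tag, line, col))

    def handle_endtag(self, tag):
        line, col = self.getpos()
        if self.open_tags and self.open_tags[-1][0] == tag:
            self.open_tags.pop()
        else:
            # The closing tag does not match what is currently open.
            expected = self.open_tags[-1][0] if self.open_tags else None
            self.problems.append((line, col, tag, expected))


def check(html_text):
    """Return (problems, tags left open) for a piece of HTML."""
    checker = NestingChecker()
    checker.feed(html_text)
    checker.close()
    return checker.problems, checker.open_tags


if __name__ == "__main__":
    sample = "<table>\n<tr><td>a\n<tr><td>b</td></tr>\n</table>\n"
    problems, still_open = check(sample)
    for line, col, tag, expected in problems:
        print("line %d: </%s> found while <%s> is open" % (line, tag, expected))
    for tag, line, col in still_open:
        print("<%s> opened at line %d was never closed" % (tag, line))
```

In the sample the complaint is raised at the `</table>` line, which is
perfectly fine on its own.  The actual cause is the row on line 2 that was
never closed.  A real page with hundreds of such errors can leave a parser in
a far stranger state than this toy stack.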

Ciao,
	Marc 'BlackJack' Rintsch


