HTMLParser.HTMLParseError: EOF in middle of construct

Sérgio Monteiro Basto sergio at sergiomb.no-ip.org
Tue Jun 19 21:09:32 EDT 2007


Stefan Behnel wrote:

> Sérgio Monteiro Basto wrote:
>> but is one single error that blocks this.
>> Finally I found it , it is :
>> <td colspan="2"align="center"
>> if I put :
>> <td colspan="2" align="center"
>>
>> p = re.compile('"align')
>> content = p.sub('" align', content)
>> 
>> I can parse the html
>> I don't know if it a bug of HTMLParser
> 
> Sure, and next time your key doesn't open your neighbours house, please
> report to the building company to have them fix the door.
> 

The question here is whether
<td colspan="2"align="center"
is valid HTML or not.
I think it is valid; if so, it's a bug in HTMLParser.
If not, we still have a very bad error message ("EOF in middle of
construct"!?).

I have to use HTMLParser because I want to use only the Python 2.4 standard
library; I have to install the scripts on many machines.
I also have to parse many different sites, and I just want to extract the
links, so a cleanup before parsing solves my problem very quickly.
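
For the record, here is a rough sketch of the cleanup-then-parse approach
described above. It is only an illustration, not a tested solution: the names
add_missing_attribute_spaces, LinkCollector and extract_links are made up for
this example, and the regex assumes attribute values are double-quoted and
that the ="..." pattern does not appear in ordinary page text. It generalises
the '"align' substitution to any attribute name glued to the previous value.

```python
import re

try:
    from html.parser import HTMLParser  # Python 3 location
except ImportError:
    from HTMLParser import HTMLParser  # Python 2 location (the module in this thread)

# A double-quoted attribute value that is immediately followed by a letter,
# i.e. the next attribute name starts with no whitespace in between.
# Assumption: values are double-quoted and this pattern only occurs inside tags.
GLUED_ATTRIBUTE = re.compile(r'(=\s*"[^"]*")(?=[A-Za-z])')


def add_missing_attribute_spaces(content):
    """Insert a space between a quoted value and a following attribute name."""
    return GLUED_ATTRIBUTE.sub(r'\1 ', content)


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> start tag (hypothetical helper)."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(content):
    """Clean up the markup first, then feed it to the parser."""
    collector = LinkCollector()
    collector.feed(add_missing_attribute_spaces(content))
    collector.close()
    return collector.links
```

Calling extract_links(content) on the downloaded page text should then return
the list of href values. The cleanup step is deliberately narrow, so other
kinds of broken markup would still need their own fix.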

Thanks,
--
Sérgio M. B. 
 


