[Tutor] Make beautifulsoup show the data it has an issue with

Kent Johnson kent37 at tds.net
Sun Apr 12 16:47:09 CEST 2009


On Sun, Apr 12, 2009 at 10:21 AM, Sander Sweers <sander.sweers at gmail.com> wrote:
> 2009/4/10 Kent Johnson <kent37 at tds.net>:
>> Or, catch
>> the exception, have the code find out where the error is and display
>> the bad line.
>
> This is what I was looking for. I know how to catch the exception but
> how do I make it display the bad line?

You had:

tsoup = BeautifulSoup(readPage('http://url.sanitized'))

and you got

HTMLParseError: malformed start tag, at line 167, column 73

so try something like this (assuming Python 2.x):

from HTMLParser import HTMLParseError
data = readPage('http://url.sanitized')
try:
  tsoup = BeautifulSoup(data)
except HTMLParseError, ex:
  lines = data.splitlines()
  bad_line = lines[ex.lineno - 1]  # ex.lineno counts from 1, list indexes from 0
  print ex
  print repr(bad_line)  # use repr() so non-printing chars will show
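
If you also want to see where in the line the parser gave up, here is a minimal sketch. The helper name format_bad_line is my own invention, not part of BeautifulSoup or HTMLParser. It assumes the reported line number counts from 1 and the column counts from 0, which is how HTMLParser tracks positions as far as I know, and it splits on '\n' only because that is the character the parser counts when it advances its line number.

```python
def format_bad_line(data, lineno, offset):
    """Return the reported line with a caret under the reported column.

    Hypothetical helper for illustration. Assumes lineno is 1-based and
    offset is 0-based, as in HTMLParseError's lineno and offset attributes.
    """
    # Split on "\n" only, so the line count matches the parser's own count.
    lines = data.split("\n")
    if not 1 <= lineno <= len(lines):
        return "line %d is out of range (data has %d lines)" % (lineno, len(lines))
    # Drop a trailing "\r" left over from Windows-style line endings.
    bad_line = lines[lineno - 1].rstrip("\r")
    # Second row: spaces up to the column, then a caret marking the spot.
    return "%s\n%s^" % (bad_line, " " * offset)
```

In the except block above you could then write print format_bad_line(data, ex.lineno, ex.offset). The caret will not line up if the line contains tabs or other non-printing characters, so keep the repr() output as well.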

Kent

