BeautifulSoup vs. Microsoft
Duncan Booth
duncan.booth at invalid.invalid
Thu Mar 29 07:52:21 EDT 2007
"Justin Ezequiel" <justin.mailinglists at gmail.com> wrote:
> On Mar 29, 4:08 pm, Duncan Booth <duncan.bo... at invalid.invalid> wrote:
>> John Nagle <n... at animats.com> wrote:
>> > title="<!--http://www.microsoft.com/usability/information.mspx->"
>>
>> > is supposed to be an HTML comment. But it's improperly terminated.
>>
>> It is an attribute value, and unescaped angle brackets are valid in
>> attributes. It looks to me like a bug in BeautifulSoup.
>
> FWIW, see http://tinyurl.com/yjtzjz
>
> new fan of BeautifulSoup here as it helped me parse "BAD" XML
> (although my client would disagree with that description)
>
I'm right behind BeautifulSoup's ability to parse bad HTML, but I still
think it should give priority to being able to parse valid HTML without
messing it up.
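
For anyone who wants to check the attribute-value point, here is a minimal sketch. It assumes Python 3's standard-library html.parser rather than BeautifulSoup itself, since BeautifulSoup's behaviour depends on the version and the underlying parser. The AttrCollector class, the collect helper, and the surrounding anchor tag are made up for illustration; only the title value comes from the thread.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Records every attribute and every comment the parser reports."""

    def __init__(self):
        super().__init__()
        self.attrs = []      # (name, value) pairs from start tags
        self.comments = []   # text of anything parsed as a comment

    def handle_starttag(self, tag, attrs):
        self.attrs.extend(attrs)

    def handle_comment(self, data):
        self.comments.append(data)


# Hypothetical tag wrapped around the title value quoted in the thread.
SAMPLE = (
    '<a href="#" '
    'title="<!--http://www.microsoft.com/usability/information.mspx->">'
    'link</a>'
)


def collect(markup):
    """Parse markup and return the collector for inspection."""
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser


result = collect(SAMPLE)
print(result.attrs)      # the title value should come through intact
print(result.comments)   # nothing should be reported as a comment
```

If the parser treats the quoted string as an attribute value, the title arrives unchanged and no comment is reported, which is what I would expect from any parser on valid input.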
More information about the Python-list mailing list