How to use XML parsing tools on this one specific URL?

skip at pobox.com
Sun Mar 4 12:52:46 EST 2007


    Chris> http://moneycentral.msn.com/companyreport?Symbol=BBBY

    Chris> I can't validate it and xml.dom.minidom.parseString won't work on
    Chris> it.

    Chris> If this was just some teenager's web site I'd move on.  Is there
    Chris> any hope of avoiding regular expression hacks to extract the data
    Chris> from this page?

Perhaps run it through Tidy first, or use BeautifulSoup?  ElementTree can
use tidy if it's available.
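
To illustrate the general idea (a forgiving parser instead of regular
expressions), here is a rough sketch using only the standard library's
html.parser as a stand-in for BeautifulSoup or Tidy.  The SAMPLE markup, the
CellCollector class and the two helper functions are all made up for
illustration; I haven't looked at how the real page is laid out, so treat the
table structure as an assumption.

```python
from html.parser import HTMLParser
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Invented tag soup (unclosed <td>, <tr> and <br>); NOT taken from the real page.
SAMPLE = "<table><tr><td>Field A<td>1.50<tr><td>Field B<td>200<br></table>"


class CellCollector(HTMLParser):
    """Hypothetical helper: collect the text of each <td>, tolerating unclosed tags."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            # A new <td> implicitly ends the previous, unclosed one.
            self.cells.append("")
            self._in_cell = True
        elif tag in ("tr", "table"):
            self._in_cell = False

    def handle_endtag(self, tag):
        if tag in ("td", "tr", "table"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.cells[-1] += data


def strict_parse_ok(markup):
    """Return True only if minidom accepts the markup as well-formed XML."""
    try:
        minidom.parseString(markup)
        return True
    except ExpatError:
        return False


def extract_cells(markup):
    """Return the stripped text of every table cell found in the markup."""
    parser = CellCollector()
    parser.feed(markup)
    parser.close()
    return [cell.strip() for cell in parser.cells]


if __name__ == "__main__":
    print(strict_parse_ok(SAMPLE))   # minidom rejects the unclosed tags
    print(extract_cells(SAMPLE))
```

BeautifulSoup does the same sort of recovery with much less code on your
side, so it is probably the better choice if you can install it.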

Skip


