[Expat-discuss] Heh guys and gals ...

Fred Drake fdrake at acm.org
Fri Mar 5 05:12:37 CET 2010


On Thu, Mar 4, 2010 at 7:13 AM, Madden, Paul <Paul.Madden at six-group.com> wrote:
> I am processing an XHTML document with expat. All goes fine til I hit an "&nbsp;" entity. The expat terminates the parse with error "undefined entity".

This is expected.

The nbsp entity is declared in the XHTML DTD; it is not one of the
entities predefined by the XML specification (lt, gt, amp, apos and
quot).  Unless the parser actually reads the XHTML DTD, the reference
cannot be resolved.
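
For illustration, the failure can be reproduced in a few lines.  This
is only a sketch using Python's xml.parsers.expat binding rather than
the C API; the helper name describe_parse and the sample markup are
made up for the example.

```python
import xml.parsers.expat as expat

def describe_parse(document):
    """Return the collected text, or Expat's error message on failure."""
    parser = expat.ParserCreate()
    pieces = []
    # Character data may arrive in several chunks, so collect and join.
    parser.CharacterDataHandler = pieces.append
    try:
        parser.Parse(document, True)
    except expat.ExpatError as err:
        # ErrorString maps the numeric error code to Expat's message.
        return "error: " + expat.ErrorString(err.code)
    return "".join(pieces)

# No DTD has been read, so nbsp is unknown to the parser.
print(describe_parse("<p>a&nbsp;b</p>"))
```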

If you control the input data, you could use a numeric reference to
the Unicode character itself instead of the HTML-centric entity:
"&#xA0;" would be appropriate markup.
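
A quick check of the numeric reference, again sketched with Python's
xml.parsers.expat binding; collect_text is a hypothetical helper, not
part of Expat.

```python
import xml.parsers.expat as expat

def collect_text(document):
    """Parse a document and return all of its character data."""
    parser = expat.ParserCreate()
    pieces = []
    parser.CharacterDataHandler = pieces.append
    parser.Parse(document, True)
    return "".join(pieces)

# A numeric character reference needs no DTD; the parser delivers
# U+00A0 (NO-BREAK SPACE) directly as character data.
text = collect_text("<p>a&#xA0;b</p>")
print(repr(text))
```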

Alternatively, you could register a handler to parse external entities
(including the XHTML DTD) if references are provided from the
document.  The XML_UseForeignDTD API can be used to load a DTD if the
document doesn't include an explicit reference to a DTD.
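
A rough sketch of the foreign-DTD approach, using Python's
xml.parsers.expat binding (UseForeignDTD there wraps
XML_UseForeignDTD).  The assumptions: MINIMAL_DTD is a made-up
one-line stand-in for the real XHTML DTD, parse_with_foreign_dtd is a
hypothetical helper, and Expat was built with DTD support.  Parameter
entity parsing has to be enabled for the external subset to be read.

```python
import xml.parsers.expat as expat

# Hypothetical stand-in for the XHTML DTD: only the entity needed here.
MINIMAL_DTD = '<!ENTITY nbsp "&#160;">'

def parse_with_foreign_dtd(document, dtd_text=MINIMAL_DTD):
    """Parse a document, supplying dtd_text as its external DTD subset."""
    parser = expat.ParserCreate()
    pieces = []
    parser.CharacterDataHandler = pieces.append

    def load_dtd(context, base, system_id, public_id):
        # For a foreign DTD the arguments are all None; a child parser
        # created from the same context reads the DTD text.
        child = parser.ExternalEntityParserCreate(context)
        child.Parse(dtd_text, True)
        return 1  # non-zero tells Expat the entity was handled

    parser.ExternalEntityRefHandler = load_dtd
    parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
    # Must be called before parsing starts.
    parser.UseForeignDTD(True)
    parser.Parse(document, True)
    return "".join(pieces)

print(repr(parse_with_foreign_dtd("<p>a&nbsp;b</p>")))
```

In real use the handler would feed the actual XHTML DTD (and the
entity sets it references) instead of the one-line string.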


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"Chaos is the score upon which reality is written." --Henry Miller

