[Expat-discuss] Always reports utf-8 encoding?

Franky Braem franky.braem at gmail.com
Wed Sep 27 23:02:13 CEST 2006


I've compiled Expat with XML_UNICODE defined to get UTF-16 output, but
the character data handler still seems to receive its data as UTF-8.
The XML file itself is stored as UTF-16.

This is what I do in the handler:

void ModulesXMLParser::CharacterDataHandler(void *userData,
                                            const XML_Char *s,
                                            int len)
{
    ModulesXMLParser *modxml = (ModulesXMLParser *) userData;
    for(int i = 0; i < len; i++)
    {
        const unsigned t = s[i];
        modxml->m_chars.AppendByte(t);
    }
    //modxml->m_chars.AppendData((void *) s, len);
}

And this is how I convert the information stored in m_chars:

    wxMBConvUTF16 conv;
    modxml->m_chars.AppendByte('\0');
    modxml->m_chars.AppendByte('\0');
    wxString dllName = wxString((const char *) modxml->m_chars.GetData(),
                                conv);

The above doesn't work. The following works:

    wxString dllName = wxString((const char *) modxml->m_chars.GetData(),
                                wxConvUTF8);
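
A possible explanation, offered as a sketch and not a confirmed diagnosis:
in an XML_UNICODE build, XML_Char is a 16-bit type and len counts XML_Char
units, so the AppendByte loop in the handler above may be keeping only the
low byte of each unit. For ASCII-only text those low bytes are also valid
UTF-8, which would match the behaviour described. The stand-alone example
below uses hypothetical names (Char16, ByteBuffer, appendTruncated,
appendRaw) in place of XML_Char and the wx buffer; it is not Expat or
wxWidgets code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for XML_Char in an XML_UNICODE build
// (assumed to be 16 bits wide).
typedef unsigned short Char16;

// Hypothetical stand-in for the byte buffer (m_chars in the handler).
typedef std::vector<unsigned char> ByteBuffer;

// Mirrors the per-unit AppendByte loop: one byte is stored per code
// unit, so the high byte of every UTF-16 unit is dropped.
void appendTruncated(ByteBuffer &buf, const Char16 *s, int len)
{
    for (int i = 0; i < len; i++)
        buf.push_back(static_cast<unsigned char>(s[i]));
}

// Alternative: copy the raw code units. The byte count is
// len * sizeof(Char16), and the bytes are in native byte order.
void appendRaw(ByteBuffer &buf, const Char16 *s, int len)
{
    assert(len >= 0);
    const std::size_t old = buf.size();
    const std::size_t n = static_cast<std::size_t>(len) * sizeof(Char16);
    buf.resize(old + n);
    if (n > 0)
        std::memcpy(&buf[old], s, n);
}
```

With the wx buffer, the equivalent of appendRaw would presumably be
AppendData(s, len * sizeof(XML_Char)); note that the commented-out call in
the handler passes len, which is a count of characters, not bytes.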

Any ideas on how to get UTF-16 output?

Franky.

