[XML-SIG] Strings or Unicode ?

Juergen Hermann <jh@web.de>
Fri, 09 Nov 2001 19:35:26 +0100


On Fri, 9 Nov 2001 17:55:39 +0100, Martin v. Loewis wrote:

>It probably is. Following the Python convention, I'd suggest to use
>byte strings only in the ASCII case, and convert non-ASCII Latin-1 to
>Unicode. It will be simpler that way *if* you have Latin-1 element
>names, since ASCII autoconverts, whereas full Latin-1 doesn't.

I do this:

PirxxObject PirxxBuildStringOrUnicode(const XMLCh* xmlstr)
{
    PirxxObject result = PirxxBuildUnicode(xmlstr);
    if (result) {
        PyObject* latin1 = PyUnicode_AsLatin1String(result);
        if (latin1) {
            result = latin1;
            Py_DECREF(latin1);
        } else {
            PyErr_Clear();
        }
    }

    return result;
}

In a nutshell this means "return Unicode unless it's convertible to
Latin-1, in which case return a Latin-1-encoded byte string". I wonder
if that is exactly what you said above, or only very similar.
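
To make the comparison concrete, here is a minimal sketch of the two rules
side by side. It is plain C++ with no Python C API involved, and the names
ReturnKind, asciiPolicy and latin1Policy are hypothetical, made up only for
this illustration. The two rules agree for pure ASCII and for anything
outside Latin-1; they differ only for characters in U+0080..U+00FF.

```cpp
#include <string>

// Hypothetical names for illustration; not part of Pirxx or the Python C API.
enum class ReturnKind { ByteString, Unicode };

// Rule from the quoted suggestion: byte strings only for pure ASCII,
// Unicode for everything else (including non-ASCII Latin-1).
ReturnKind asciiPolicy(const std::u16string& s) {
    for (char16_t c : s) {
        if (c > 0x7F) return ReturnKind::Unicode;
    }
    return ReturnKind::ByteString;
}

// Rule implemented by the function above: byte strings for anything
// that can be encoded as Latin-1, Unicode otherwise.
ReturnKind latin1Policy(const std::u16string& s) {
    for (char16_t c : s) {
        if (c > 0xFF) return ReturnKind::Unicode;
    }
    return ReturnKind::ByteString;
}
```

For an element name such as u"J\u00FCrgen", asciiPolicy yields Unicode while
latin1Policy yields a byte string, so the two are very similar but not
identical.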


Ciao, Jürgen