[Python-Dev] Minidom and Unicode

Guido van Rossum guido@beopen.com
Sat, 01 Jul 2000 09:13:36 -0500


> mal wrote:
> > > I'm not sure whose fault that is: either __repr__ should accept
> > > unicode strings, or minidom.Element.__repr__ should be changed to
> > > return a plain string, e.g. by converting tagname to UTF-8. In any
> > > case, I believe __repr__ should 'work' for these objects.
> > 
> > Note that __repr__ has to return a string object (and IIRC
> > this is checked in object.c or abstract.c). The correct way
> > to get there is to simply return str(...) or to have a
> > switch on the type of self.tagName and then call .encode().
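
A minimal sketch of the "switch on the type, then call .encode()"
workaround described above. It is written so that it runs on a current
Python 3 interpreter, so Python 3's str stands in for the unicode type
and bytes stands in for the plain string type discussed in this thread.
The Element class, the plain_tag_name helper, the plain_repr method and
the choice of UTF-8 are assumptions made for illustration; this is not
minidom's actual code.

```python
# Python 3 model of the workaround: `str` plays the role of the old
# unicode type, `bytes` the role of the old plain string type.

def plain_tag_name(tag_name, encoding="utf-8"):
    """Return tag_name as a plain (byte) string, encoding it if needed."""
    if isinstance(tag_name, str):
        # Unicode tag name: convert it explicitly (UTF-8 is an assumption).
        return tag_name.encode(encoding)
    # Already a plain string: pass it through unchanged.
    return tag_name


class Element:
    """Hypothetical stand-in for minidom.Element, not the real class."""

    def __init__(self, tag_name):
        self.tagName = tag_name

    def plain_repr(self):
        # Stand-in for __repr__: always yields a plain (byte) string,
        # whichever kind of string tagName happens to be.
        return b"<DOM Element: " + plain_tag_name(self.tagName) + b">"
```

With this in place, Element("caf\u00e9").plain_repr() and
Element(b"a").plain_repr() both produce a plain string, so a caller that
insists on that type no longer has to care how tagName was stored.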

[/F]
> assuming that the goal is to get rid of this restriction in future
> versions (a string is a string is a string), how about special-
> casing this in PyObject_Repr:
> 
>         PyObject *res;
>         res = (*v->ob_type->tp_repr)(v);
>         if (res == NULL)
>             return NULL;
> ---
>         if (PyUnicode_Check(res)) {
>             PyObject* str;
>             str = PyUnicode_AsEncodedString(res, NULL, NULL);
>             Py_DECREF(res);
>             if (str == NULL)
>                 return NULL;  /* propagate the encoding error */
>             res = str;
>         }
> ---
>         if (!PyString_Check(res)) {
>             PyErr_Format(PyExc_TypeError,
>                      "__repr__ returned non-string (type %.200s)",
>                      res->ob_type->tp_name);
>             Py_DECREF(res);
>             return NULL;
>         }
>         return res;
> 
> in this way, people can "do the right thing" in their code,
> and have it work better in future versions...
> 
> (just say "+1", and the mad patcher will update the repository)

+1

--Guido van Rossum (home page: http://dinsdale.python.org/~guido/)