[Python-Dev] Minidom and Unicode

M.-A. Lemburg mal@lemburg.com
Sat, 01 Jul 2000 18:46:52 +0200


Fredrik Lundh wrote:
> 
> mal wrote:
> > > I'm not sure whose fault that is: either __repr__ should accept
> > > unicode strings, or minidom.Element.__repr__ should be changed to
> > > return a plain string, e.g. by converting tagname to UTF-8. In any
> > > case, I believe __repr__ should 'work' for these objects.
> >
> > Note that __repr__ has to return a string object (and IIRC
> > this is checked in object.c or abstract.c). The correct way
> > to get there is to simply return str(...) or to have a
> > switch on the type of self.tagName and then call .encode().
> 
> assuming that the goal is to get rid of this restriction in future
> versions (a string is a string is a string), how about special-
> casing this in PyObject_Repr:
> 
>         PyObject *res;
>         res = (*v->ob_type->tp_repr)(v);
>         if (res == NULL)
>             return NULL;
> ---
>         if (PyUnicode_Check(res)) {
>             PyObject* str;
>             str = PyUnicode_AsEncodedString(res, NULL, NULL);
>             if (str) {
>                 Py_DECREF(res);
>                 res = str;
>             }
>         }
> ---
>         if (!PyString_Check(res)) {
>             PyErr_Format(PyExc_TypeError,
>                      "__repr__ returned non-string (type %.200s)",
>                      res->ob_type->tp_name);
>             Py_DECREF(res);
>             return NULL;
>         }
>         return res;
> 
> in this way, people can "do the right thing" in their code,
> and have it work better in future versions...
> 
> (just say "+1", and the mad patcher will update the repository)

I'd say +0, since the auto-conversion can fail if the default
encoding cannot represent the characters in tagName.
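
As a rough illustration of that failure mode, here is a minimal sketch
in present-day Python 3, where str plays the role of the Unicode object
and bytes the role of the 8-bit string. The helper encode_with_default,
the sample tag name, and the assumption of an ASCII default encoding
are all hypothetical and not taken from the patch above.

```python
# Python 3 sketch: str stands in for the Unicode object discussed in the
# thread, and bytes stands in for the 8-bit string.
tag_name = "caf\u00e9"  # hypothetical tag name with one non-ASCII character


def encode_with_default(text, default_encoding="ascii"):
    """Mimic an implicit conversion; return None when it fails.

    The ASCII default is an assumption made for this sketch.
    """
    try:
        return text.encode(default_encoding)
    except UnicodeEncodeError:
        # The default encoding cannot represent at least one character.
        return None


ascii_result = encode_with_default("item")     # pure ASCII: converts fine
failed_result = encode_with_default(tag_name)  # non-ASCII: conversion fails
```

With a pure ASCII tag name the conversion succeeds, while the second
call fails, so a repr built on it would fail for such elements.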

Either way, I'd still prefer the DOM code to use an explicit
.encode() together with some lossless encoding, e.g.
unicode-escape.
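
A minimal sketch of what I mean, again written in present-day Python 3
rather than the interpreter under discussion. The Element class below
is a hypothetical stand-in, not the real minidom.Element, and the repr
format is simplified for illustration.

```python
class Element:
    """Hypothetical stand-in for a DOM element; not the minidom class."""

    def __init__(self, tag_name):
        self.tagName = tag_name

    def __repr__(self):
        # Explicit conversion: non-ASCII characters become backslash
        # escapes, so the result is always representable in ASCII.
        escaped = self.tagName.encode("unicode-escape").decode("ascii")
        return "<DOM Element: %s>" % escaped


elem = Element("caf\u00e9")
text = repr(elem)

# The escape is reversible, so no information about the name is lost.
round_trip = "caf\u00e9".encode("unicode-escape").decode("unicode-escape")
```

The point of the design is that the repr never depends on the default
encoding, and the original name can still be recovered from the escaped
form.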

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/