[Python-Dev] Unicode-like objects

Guido van Rossum guido@python.org
Tue, 04 Feb 2003 20:39:01 -0500


> Unless I misunderstand, it's currently impossible to create an object
> that behaves just like a unicode object unless the backing store is
> compatible with a "real" unicode object (in which case the buffer
> interface can be used).  [...]

Can you explain what you mean by "behaves just like a unicode object"?

I'd think that's the crux of the matter.  Which operations do you want
to behave in a certain way, but currently don't?

E.g. are you interested in what happens when a real unicode object and a
pseudo-unicode object are combined in a binary operation?  Or when you
pass a pseudo-unicode object as an argument to a method of a real
unicode object?  Or when a pseudo-unicode is passed to some standard
function that expects a unicode object?
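
As a rough illustration of those three cases, here is a minimal sketch.
It assumes a present-day Python 3 interpreter, where str is the unicode
type; the PseudoStr class and the probe() helper are hypothetical names
made up for this example, and the behaviour of the interpreter
discussed in this thread may differ in detail.

```python
class PseudoStr:
    """Hypothetical string-like object (not part of Python).

    Its backing store is a list of code points, so it is deliberately
    not compatible with the storage of a real str object.
    """

    def __init__(self, text):
        self._codes = [ord(ch) for ch in text]

    def __str__(self):
        return "".join(chr(c) for c in self._codes)

    def __add__(self, other):
        return PseudoStr(str(self) + str(other))

    def __radd__(self, other):
        # Used for "real + pseudo": the real str cannot handle the
        # operand, so Python falls back to this reflected method.
        return PseudoStr(str(other) + str(self))


def probe(action):
    """Run action() and report whether it worked or raised TypeError."""
    try:
        return ("ok", type(action()).__name__)
    except TypeError:
        return ("TypeError", None)


pseudo = PseudoStr("abc")

results = {
    # 1. A real and a pseudo object combined in a binary operation.
    "binary operation": probe(lambda: "x" + pseudo),
    # 2. A pseudo object passed to a method of a real object.
    "method argument": probe(lambda: "abcabc".count(pseudo)),
    # 3. A pseudo object passed to a standard function.
    "standard function": probe(lambda: ord(PseudoStr("a"))),
}

for case, outcome in results.items():
    print(case, outcome)
```

Under these assumptions only the binary operation can be made to work
from the pseudo object's side, because the language offers a hook
(__radd__) for it.  The other two cases are decided by code that
insists on a real string.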

There are hundreds if not thousands (maybe even tens! :-) of places in
the Python source code where unicode objects are special-cased.  This
may be a practical barrier to implementing what you want.
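
One visible effect of such special-casing can be sketched as follows,
again assuming a present-day Python 3 interpreter where str is the
unicode type.  The Shouting class is a hypothetical name invented for
this example.  Even a genuine subclass cannot change what the built-in
string code sees, because that code reads the stored characters
directly instead of calling the object's methods.

```python
class Shouting(str):
    """Hypothetical str subclass that tries to present itself in upper case."""

    def __str__(self):
        return self.upper()


word = Shouting("abc")

# str() is a hook the subclass can override, so the override is used.
via_override = str(word)

# join() accepts the object because it is a real str underneath, but it
# copies the stored characters and never calls the overridden __str__.
via_join = "-".join([word, "d"])

print(via_override)
print(via_join)
```

A pseudo object with a different backing store has no stored characters
of that kind to offer, which is why each such place would need to be
changed.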

Maybe a refactoring of the unicode object can be considered that
allows what you want, but given the amount of delicate code (including
3rd party code like the Japanese codecs) that is involved, I doubt
that this will be easy...

--Guido van Rossum (home page: http://www.python.org/~guido/)