codecs limitation

Thu Feb 19 04:45:28 EST 2004

On Wed, 18 Feb 2004, A.M. Kuchling wrote:

AK> > I have the same question as stated in comments: should we really
AK> > enforce this and forget the idea to define some specialized
AK> > encodings like 'html'?
AK>
AK> I suppose it depends on what the codecs system is *for*.  If it's an
AK> interface that goes between the abstract world of Unicode code
AK> points and the concrete world of 8-bit characters that represent
AK> those code points, then the idea of returning anything but an 8-bit
AK> string from .encode() doesn't make sense.  If codecs are for arbitrary

This restriction doesn't apply to decode, so we already have
codecs like 'base64', 'quoted-printable', 'uu', and 'zlib'.  An
'html' encoding should be applicable to unicode too, and in that
case the result must be unicode.  I can't see why the restriction
is applied only partially: why does it allow str<->str conversion
but forbid unicode<->unicode?  Certainly, both 'base64' et al.
and 'html' can be implemented as standalone functions.

AK> string-to-string transformations, then the restriction should be relaxed.
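
To make the asymmetry concrete, here is a minimal sketch.  It assumes
a modern Python 3 interpreter rather than the one discussed in this
thread, so the 8-bit str above corresponds to bytes and unicode
corresponds to str.  The names html_encode and html_decode are
hypothetical standalone functions, not existing codecs.

```python
import codecs
import html

# 8-bit <-> 8-bit ("str <-> str" in the terms used above): the existing
# transform codecs take 8-bit data and return 8-bit data.
packed = codecs.encode(b"codecs limitation", "base64")
unpacked = codecs.decode(packed, "base64")

squeezed = codecs.encode(b"a" * 100, "zlib")
restored = codecs.decode(squeezed, "zlib")


# text <-> text ("unicode <-> unicode"): an 'html' transform written as
# standalone functions (hypothetical names, for illustration only).
def html_encode(text):
    """Escape markup characters; takes text and returns text."""
    return html.escape(text, quote=True)


def html_decode(text):
    """Reverse html_encode; takes text and returns text."""
    return html.unescape(text)


sample = "1 < 2 & \u00e9"
escaped = html_encode(sample)
round_tripped = html_decode(escaped)
```

Both pairs are same-type transformations; the only difference is
whether the type on each side is 8-bit data or text.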

-- 
Denis S. Otkidach
http://www.python.ru/      [ru]