Where is the ucs-32 codec?

Erik Max Francis max at alcyone.com
Sun Jun 4 18:42:51 EDT 2006


beni.cherniavsky at gmail.com wrote:

> Python seems to be missing a UCS-32 codec, even in wide builds (not
> that the build should matter).
> Is there some deep reason or should I just contribute a patch?
> 
> If it's just a bug, should I call the codec 'ucs-32' or 'utf-32'?  Or
> both (aliased)?
> There should be '-le' and '-be' variants, I suppose.  Should there be a
> variant without explicit endianness, using a BOM to decide (like
> 'utf-16')?
> And it should combine surrogates into valid characters (on all builds),
> like the 'utf-8' codec does, right?

Note that UTF-32 is effectively UCS-4; the 4 counts octets per 
character, not bits.  UCS-32 ("Universal Character Set in 32 octets") 
wouldn't make much sense.

Not that Python has a UCS-4 encoding available either.  I'm really not 
sure why.
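
In the meantime, a stopgap is easy enough to sketch by hand with the 
struct module.  This is only a rough illustration, not a proposed 
codec: the names encode_ucs4 and decode_ucs4 are made up here, it 
assumes a Python where str holds Unicode text, and treating BOM-less 
input as big-endian is my assumption.  It writes one 32-bit unit per 
code point and combines surrogate pairs on the way out, as asked above.

```python
import struct


def encode_ucs4(text, byteorder="little", bom=False):
    """Encode text as UCS-4/UTF-32: one 32-bit unit per code point.

    Hypothetical helper for illustration; not a standard library API.
    """
    fmt = "<I" if byteorder == "little" else ">I"
    units = [0xFEFF] if bom else []
    i = 0
    while i < len(text):
        cp = ord(text[i])
        # Combine a high/low surrogate pair into a single code point.
        if 0xD800 <= cp <= 0xDBFF and i + 1 < len(text):
            low = ord(text[i + 1])
            if 0xDC00 <= low <= 0xDFFF:
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00)
                i += 1
        units.append(cp)
        i += 1
    return b"".join(struct.pack(fmt, u) for u in units)


def decode_ucs4(data):
    """Decode UCS-4 bytes, using a leading BOM to pick the byte order.

    Assumption: input without a BOM is treated as big-endian.
    """
    if len(data) % 4:
        raise ValueError("UCS-4 data must be a multiple of 4 bytes")
    order = ">"
    if data[:4] == b"\xff\xfe\x00\x00":    # little-endian BOM
        order, data = "<", data[4:]
    elif data[:4] == b"\x00\x00\xfe\xff":  # big-endian BOM
        data = data[4:]
    count = len(data) // 4
    code_points = struct.unpack("%s%dI" % (order, count), data)
    return "".join(chr(cp) for cp in code_points)
```

The byteorder and bom arguments stand in for the '-le', '-be' and 
BOM-detecting variants asked about; a real codec would presumably be 
registered through the codecs machinery rather than exposed as bare 
functions.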

-- 
Erik Max Francis && max at alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis
   Could it be / That we need loving to survive
   -- Neneh Cherry


