[I18n-sig] Changing case
M.-A. Lemburg
mal@lemburg.com
Wed, 12 Apr 2000 11:30:40 +0200
[CCing to i18n too]
Andy Robinson wrote:
>
> > To make all this work without too many hassles we'd need
> > (at least the most commonly used) CJKV codecs in the core
> > distribution. How big would these be ? Would someone contribute
> > them... Tamito ?
> >
> He may be at home by now, but he indicated to me that he was
> happy for them to be used in any way. The nice things about
> his codecs are
> (a) one could extract the mapping tables for other codecs
> from data at www.unicode.org and use a very similar
> approach.
> (b) the mappings may be 168k, but they at least zip nicely.
> I'm guessing at 5-6 such codecs in the distribution
> initially.
> (c) the algorithmic bit can be accelerated later in C or our
> vaporware state machine, and nobody needs to change
> any interfaces.
> (d) if we slightly parameterise his codecs so that one could
> substitute a different mapping table if needed, then
> all the corporate variations just need to create a
> new dictionary with the deltas - Microsoft Code Page
> 932 would not be another 168k, but just a few k and
> could build its mapping on the fly.
Sounds ok to me.
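
To illustrate point (d), here is a minimal sketch of a codec whose
mapping table can be substituted, so that a vendor variant only has to
supply its deltas. It is written in present-day Python, not against
Tamito's actual codecs: the names BASE_DECODING, VARIANT_DELTA and
make_decoder are hypothetical, and the tiny tables merely stand in for
the real 168k mappings.

```python
from collections import ChainMap

# Stand-in for a large base decoding table (double-byte sequence -> character).
# Only a few entries are shown; the values are for illustration only.
BASE_DECODING = {
    b"\x81\x40": "\u3000",  # ideographic space
    b"\x81\x60": "\u301c",  # wave dash in the base table
    b"\x88\x9f": "\u4e9c",
}

# A vendor variant lists only the entries that differ from the base table.
VARIANT_DELTA = {
    b"\x81\x60": "\uff5e",  # the variant maps the same bytes to fullwidth tilde
}


def make_decoder(table):
    """Return a decode function parameterised by the given mapping table."""

    def decode(data):
        out = []
        i = 0
        while i < len(data):
            byte = data[i]
            if byte < 0x80:
                # Single-byte range: treated as plain ASCII in this sketch.
                out.append(chr(byte))
                i += 1
            else:
                # Everything else is assumed to be a two-byte sequence.
                out.append(table[bytes(data[i:i + 2])])
                i += 2
        return "".join(out)

    return decode


# The variant table is built on the fly: deltas are consulted first, then the base.
decode_base = make_decoder(BASE_DECODING)
decode_variant = make_decoder(ChainMap(VARIANT_DELTA, BASE_DECODING))
```

The algorithmic part is shared and only the table changes, so the
decode loop could later be replaced by a faster implementation without
touching the tables or the interface.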
> However, I suspect putting it in the core for June 1st may
> be too aggressive; if the compiler is going to use them on
> every source file for a Japanese user, we really want to
> move from byte-level loops in Python to something much faster.
Speed is not an issue now: what we need is a good concept
and some proof-of-concept code to go with it.
BTW, all this will go into 1.7 AFAIK... 1.6 will have to do
with what's there now. I may get a patch done for the -e
command line switch -- but only as an experimental feature
in 1.6.
Unfortunately, Guido's out at the moment, so he can't
comment on this...
--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/