[Python-Dev] Unicode 5.1.0

Guido van Rossum guido at python.org
Fri Aug 22 18:12:55 CEST 2008


2008/8/22 Fredrik Lundh <fredrik at pythonware.com>:
> On Fri, Aug 22, 2008 at 4:59 PM, Guido van Rossum <guido at python.org> wrote:
>
>>> (how's the 3.2/4.1 dual support implemented?  do we have two distinct
>>> datasets, or are the differences encoded in some clever way?  would it
>>> make sense to split the unicodedata module into three separate
>>> modules, one for each major Unicode version?)
>>
>> The current API looks fine to me: unicodedata is the latest version
>> whereas unicodedata.ucd_3_2_0 is the older version. The APIs are the
>> same; there's a tiny bit of code in the generated _db.h file that
>> expresses the differences:
>>
>> static const change_record* get_change_3_2_0(Py_UCS4 n)
>> {
>>        int index;
>>        if (n >= 0x110000) index = 0;
>>        else {
>>                index = changes_3_2_0_index[n>>7];
>>                index = changes_3_2_0_data[(index<<7)+(n & 127)];
>>        }
>>        return change_records_3_2_0+index;
>> }
>
> there's a bunch of data tables as well, but they don't seem to be very
> large.  looks like Martin did a thorough job here.
>
> ... digging digging digging ...
>
> yes, the generator script produces difference tables between the main
> version and a list of older versions.  I'd say it's worth running the
> script on the 5.1.0 tables, and if it doesn't choke, compare the
> resulting table with the corresponding table for 4.1.0 (a simple loop
> fetching the main properties for all code points).  if the differences
> look reasonably small, switch to 5.1.0 and keep the others.
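
For illustration, the "simple loop fetching the main properties for all code points" could look roughly like the sketch below. It is written against the present-day Python 3 unicodedata API (chr() rather than unichr()). The helper names property_tuple and diff_databases are made up for this example, and which properties count as "main" is an assumption.

```python
import sys
import unicodedata


def property_tuple(db, ch):
    """Collect a set of per-character properties from a unicodedata-like object.

    `db` may be the unicodedata module itself or a UCD object such as
    unicodedata.ucd_3_2_0; both expose the same lookup functions.
    """
    return (
        db.category(ch),
        db.bidirectional(ch),
        db.combining(ch),
        db.mirrored(ch),
        db.decomposition(ch),
        db.decimal(ch, None),   # None instead of ValueError when undefined
        db.digit(ch, None),
        db.numeric(ch, None),
    )


def diff_databases(new_db, old_db, limit=sys.maxunicode + 1):
    """Return the code points below `limit` whose properties differ."""
    changed = []
    for cp in range(limit):
        ch = chr(cp)
        if property_tuple(new_db, ch) != property_tuple(old_db, ch):
            changed.append(cp)
    return changed
```

Calling diff_databases(unicodedata, unicodedata.ucd_3_2_0) walks the whole code space and gives a rough measure of how far the two versions are apart. The same loop run against a build with new tables and one with the old tables would give the comparison suggested above.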

Right, that's my hope as well. I believe the changes between 3.2 and 4.1
were much larger than more recent changes. (Yay convergence! :-)
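
The size of the differences matters because of how a two-level table like the one read by get_change_3_2_0 behaves: code points are grouped into blocks of 128, and identical blocks only need to be stored once. The sketch below shows that idea in Python. It is an illustration under that assumption, not the generator script's actual code, and the names split_two_level and lookup are hypothetical.

```python
SHIFT = 7                 # 128 code points per block, matching n>>7 in the C code
BLOCK = 1 << SHIFT


def split_two_level(values):
    """Split a flat table into (index, data), storing each distinct block once."""
    assert len(values) % BLOCK == 0
    index, data, seen = [], [], {}
    for start in range(0, len(values), BLOCK):
        block = tuple(values[start:start + BLOCK])
        if block not in seen:
            seen[block] = len(data) >> SHIFT   # block number inside `data`
            data.extend(block)
        index.append(seen[block])
    return index, data


def lookup(index, data, n):
    """Two array reads per code point, in the style of get_change_3_2_0."""
    if n >= len(index) << SHIFT:
        return 0                               # out of range: record 0
    i = index[n >> SHIFT]
    return data[(i << SHIFT) + (n & (BLOCK - 1))]
```

With a flat table that is zero almost everywhere, nearly every block is the same all-zero block, so the data array stays tiny no matter how large the code space is. The fewer code points change between versions, the fewer distinct blocks there are to store.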

> I can tinker a little with this over the weekend, unless Martin tells
> me not to ;-)

That would be great!

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)