[Python-Dev] Problems with the new unicodectype.c
Fredrik Lundh <effbot@telia.com>
Tue, 11 Jul 2000 23:16:42 +0200
tim wrote:
> I believe we also need a way to split unicodedatabase.c into multiple files,
> as > 64K lines in a source file is unreasonable (Python can't handle a
> source file that large either, and Python is the *definition* of
> reasonableness here <wink>), and the MS compiler spits out a warning about
> its sheer size.
just a heads-up: I've been hacking a little on a new unicode
database. the results so far are quite promising:
CTYPE: is*, to* functions
    118k => 13k
CNAME: code <=> name mappings (\N{name})
    440k => 160k
CINFO: remaining unicode properties
    590k => 42k
(approximate code size with the old and new code, on Windows)
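For a quick sense of scale, the approximate figures above can be turned into relative savings. This is only arithmetic on the numbers quoted in this message; the `sizes_k` dictionary and the `reduction` helper are names made up for this sketch.

```python
# Approximate code sizes in kilobytes (old, new), as quoted above.
sizes_k = {
    "CTYPE": (118, 13),
    "CNAME": (440, 160),
    "CINFO": (590, 42),
}


def reduction(old, new):
    """Return the size reduction as a percentage of the old size."""
    return 100.0 * (old - new) / old


for name, (old, new) in sizes_k.items():
    print(f"{name}: {old}k => {new}k ({reduction(old, new):.0f}% smaller)")

# Totals across all three parts.
total_old = sum(old for old, _ in sizes_k.values())
total_new = sum(new for _, new in sizes_k.values())
print(f"total: {total_old}k => {total_new}k "
      f"({reduction(total_old, total_new):.0f}% smaller)")
```

Since the figures are approximate, the percentages are rough as well.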
on the source side, 3300k source files are replaced with
about 600k (generated by a script, directly from the
unicode.txt data file).
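This message does not say how the generator script shrinks the tables. One common way to get this kind of saving is a two-stage lookup table: split the code point range into fixed-size blocks, store each distinct block once, and keep a small index that maps every block to its stored copy. The sketch below is a minimal illustration under those assumptions, not the actual script. It reads properties from Python's standard `unicodedata` module instead of parsing the data file, and `BLOCK`, `build_two_stage` and `lookup` are hypothetical names.

```python
import unicodedata

BLOCK = 128  # hypothetical block size, chosen only for illustration


def build_two_stage(prop, limit=0x10000, block=BLOCK):
    """Split a per-code-point property into an index plus unique blocks."""
    index = []    # one entry per block of code points
    blocks = []   # each distinct block, stored only once
    seen = {}     # block contents -> position in `blocks`
    for start in range(0, limit, block):
        chunk = tuple(prop(chr(cp)) for cp in range(start, start + block))
        if chunk not in seen:
            seen[chunk] = len(blocks)
            blocks.append(chunk)
        index.append(seen[chunk])
    return index, blocks


def lookup(index, blocks, cp, block=BLOCK):
    """Fetch the property for code point `cp` from the two-stage table."""
    return blocks[index[cp // block]][cp % block]


# Build a table for the general category of the first 65536 code points.
index, blocks = build_two_stage(unicodedata.category)

flat_entries = 0x10000                         # one entry per code point
packed_entries = len(index) + len(blocks) * BLOCK
print(f"flat: {flat_entries} entries, two-stage: {packed_entries} entries")
print(lookup(index, blocks, ord("A")))
```

The saving comes from large uniform ranges (CJK ideographs, private use, unassigned code points) collapsing into a handful of shared blocks. A real generator would emit the two arrays as C source rather than keep them as Python lists.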
note that the CNAME and CINFO parts are optional; you only
need CTYPE to build a working Python interpreter.
integrating this with 2.0 should be relatively straightforward,
but don't expect it to happen before next week or so...
cheers /F