[Python-Dev] Internationalization Toolkit

Andrew M. Kuchling akuchlin@mems-exchange.org
Tue, 9 Nov 1999 12:40:07 -0500 (EST)


Guido van Rossum writes:
>It's from scratch, and I believe it's got Perl style, not POSIX style
>semantics -- per Tim Peters' recommendations.  Do we need to open the
>discussion again?

No, no; I'm actually happier with Perl-style semantics, because they're
far better documented and more familiar to people. Worse *is* better,
after all.

My concern is simply that I've started translating re.py into C, and
wonder how this affects the translation.  This isn't a pressing issue,
because the C version isn't finished yet.

>It involves a redone re module (supporting Unicode as well as 8-bit),
>but its API could be unchanged.  /F does the parsing and compilation
>in Python, only the matching engine is in C -- not sure how that
>impacts performance, but I imagine with aggressive caching it would be
>okay.

Can I get my paws on a copy of the modified re.py to see what
ramifications it has, or is this all still an unreleased
work-in-progress?

Doing the compilation in Python is a good idea, and will make it
possible to implement alternative syntaxes.  I would have liked to
make it possible to generate PCRE bytecodes from Python, but what
stopped me is the chance of bogus bytecode causing the engine to dump
core, loop forever, or misbehave in some other nasty way.  (This is
particularly important for code that uses rexec.py, because you'd
expect regexes to be safe.)  Fixing the engine to be stable when faced
with bad bytecodes appears to require many additional checks that
would slow down the common case of correct code, which is unappealing.
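
To give a feel for the kind of checks involved, here is a rough sketch
of a bytecode sanity check. It uses a made-up toy instruction set, not
PCRE's actual bytecode format, and every name in it (the opcodes,
OPERANDS, BRANCHES, verify) is hypothetical. It also only catches
structural problems (unknown opcodes, truncated operands, jumps into
the middle of an instruction); it does not detect programs that loop
forever.

```python
# Hypothetical toy instruction set; NOT PCRE's real bytecode format.
LITERAL, ANY, JUMP, SPLIT, MATCH = range(5)

# Number of operands each opcode takes.
OPERANDS = {LITERAL: 1, ANY: 0, JUMP: 1, SPLIT: 2, MATCH: 0}
# Opcodes whose operands are absolute jump targets.
BRANCHES = {JUMP, SPLIT}


def verify(code):
    """Raise ValueError if `code` is structurally malformed."""
    starts = set()   # offsets at which an instruction begins
    targets = []     # (branch offset, target offset) pairs, checked later
    last_op = None
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op not in OPERANDS:
            raise ValueError("unknown opcode %r at offset %d" % (op, pc))
        n = OPERANDS[op]
        if pc + n >= len(code):
            raise ValueError("truncated operands at offset %d" % pc)
        starts.add(pc)
        if op in BRANCHES:
            for target in code[pc + 1:pc + 1 + n]:
                targets.append((pc, target))
        last_op = op
        pc += 1 + n
    if last_op != MATCH:
        raise ValueError("program does not end with MATCH")
    # Every branch must land on the first word of some instruction.
    for src, target in targets:
        if target not in starts:
            raise ValueError(
                "branch at offset %d targets invalid offset %r" % (src, target))
```

For example, verify([LITERAL, ord("a"), SPLIT, 0, 5, MATCH]) passes,
while verify([JUMP, 1, MATCH]) raises ValueError because offset 1 is an
operand, not the start of an instruction.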


-- 
A.M. Kuchling			http://starship.python.net/crew/amk/
Anybody else on the list got an opinion? Should I change the language or not?
    -- Guido van Rossum, 28 Dec 91