[Python-Dev] OT: A Day in the Life of p5p

Andrew Kuchling akuchlin@mems-exchange.org
Wed, 28 Jun 2000 16:10:28 -0400


On Wed, Jun 28, 2000 at 10:53:16AM -0700, Paul Prescod wrote:
>Note that the document doesn't yet cover the regular expression engine
>or the "PerlInterpreter". 

The regex engine's pretty hard to read, mostly because comments are
infrequent and not very helpful, and disentangling it from the rest of
Perl would require a skilled wizard.  (PCRE, if slower, is at least
much clearer and easier to understand, though the compile() function
is pretty ugly.)  A while ago I saw a p5p post from Ilya Zakharevich,
who did most of the recent regex hacking; he drew attention to one
flag variable in the code and said basically "I don't know what this
flag means; I think it's some sort of UTF-8 setting, but Larry didn't
explain it."

>I can't think of a disclaimer that doesn't sound like it is tongue in
>cheek but I do feel bad about beating up on a design which, in its own
>way, has a certain kind of quality (just not one I happen to prefer).

Agreed; it could be made much simpler, but maybe at a performance
cost.  (Though performance is tricky, and maybe the extra work costs
more than it saves.)

For example, note the flag bits in SvNULL, which have values like
GMAGICAL.  You could imagine a Python implementation that added flag
bits to every object, and set a bit if there was a __getattr__ method
defined; code could then do 'if (obj->flags & GMAGICAL) ...'  instead
of the more complicated 'if (PyObject_HasAttrString(obj,
"__getattr__")) ...'.  It would be interesting to know if Topaz, Chip
Salzenberg's experimental C++ implementation, preserves this
complexity or aims to cut it away.  The use of several levels of C
structs is also reminiscent of the way you do OO in C, as in X
toolkits.
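
To make the flag-bit idea concrete, here is a minimal sketch in C.
None of this is real Perl or Python source: the Obj struct, the
OBJ_FLAG_GMAGICAL bit (named after Perl's GMAGICAL only for
illustration), and the helper functions are all hypothetical.  The
point is just that the flag is kept in sync when the hook is
installed, so the later check is a single bit test rather than an
attribute lookup by name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit, named after Perl's GMAGICAL for illustration only. */
#define OBJ_FLAG_GMAGICAL 0x01u

typedef struct Obj Obj;
typedef int (*getattr_hook)(Obj *self, const char *name);

/* Hypothetical object header: every object carries a word of flag bits. */
struct Obj {
    unsigned int flags;
    getattr_hook getattr;   /* NULL when no __getattr__-style hook is defined */
};

/* Install or remove the hook, keeping the flag bit in sync with it. */
static void obj_set_getattr(Obj *obj, getattr_hook hook)
{
    assert(obj != NULL);
    obj->getattr = hook;
    if (hook != NULL)
        obj->flags |= OBJ_FLAG_GMAGICAL;
    else
        obj->flags &= ~OBJ_FLAG_GMAGICAL;
}

/* The cheap check: one AND and a compare, no lookup by attribute name. */
static int obj_has_getattr(const Obj *obj)
{
    assert(obj != NULL);
    return (obj->flags & OBJ_FLAG_GMAGICAL) != 0;
}

/* A trivial hook used only to exercise the sketch. */
static int demo_hook(Obj *self, const char *name)
{
    (void)self;
    (void)name;
    return 42;
}
```

The cost is the one hinted at above: every place that can add or
remove the hook has to remember to update the flag, which is where the
extra complexity comes from.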

You can also see the importance of text processing in the SvPVBM type,
for attaching a Boyer-Moore related table to a string and speeding up
regex searches.  
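
As a rough illustration of what "attaching a table to a string" buys
you, here is a sketch in C.  It is not Perl's actual SvPVBM layout or
algorithm; it uses the simpler Boyer-Moore-Horspool variant (bad
character table only), and the BMString struct and bm_* function
names are invented for this example.  The skip table is computed once
and stored alongside the string, so repeated searches don't pay for it
again.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical string type carrying a precomputed skip table. */
typedef struct {
    const char *text;
    size_t len;
    size_t skip[UCHAR_MAX + 1];   /* shift distance per byte value */
} BMString;

/* Build the Horspool bad-character table once, when the string is set up. */
static void bm_init(BMString *s, const char *text)
{
    assert(s != NULL && text != NULL);
    s->text = text;
    s->len = strlen(text);
    /* Bytes that do not occur in the pattern allow a full-length shift. */
    for (size_t c = 0; c <= UCHAR_MAX; c++)
        s->skip[c] = s->len;
    /* Bytes that occur (excluding the last position) shift less. */
    for (size_t i = 0; i + 1 < s->len; i++)
        s->skip[(unsigned char)text[i]] = s->len - 1 - i;
}

/* Return the offset of the first match of pat in hay, or -1 if none. */
static long bm_find(const BMString *pat, const char *hay, size_t hay_len)
{
    assert(pat != NULL && hay != NULL);
    if (pat->len == 0)
        return 0;
    size_t pos = 0;
    while (pos + pat->len <= hay_len) {
        if (memcmp(hay + pos, pat->text, pat->len) == 0)
            return (long)pos;
        /* Shift by the table entry for the last byte of the current window. */
        pos += pat->skip[(unsigned char)hay[pos + pat->len - 1]];
    }
    return -1;
}
```

Call bm_init() once on the string you'll be searching for, then
bm_find() as often as you like; the table rides along with the string.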

--amk