[Python-Dev] Re: [Bug #121013] Bug in <stringobject>.join(<unicodestring>)

M.-A. Lemburg mal@lemburg.com
Mon, 27 Nov 2000 22:08:41 +0100


Michael Hudson wrote:
> 
> "M.-A. Lemburg" <mal@lemburg.com> writes:
> 
> > > Date: 2000-Nov-27 10:12
> > > By: mwh
> > >
> > > Comment:
> > > I hope you're all suitably embarrassed - please see patch #102548 for the trivial fix...
> >
> > Hehe, that was indeed a trivial patch. What was that about trees
> > in a forest...
> 
> The way I found it was perhaps instructive.  I was looking at the
> function, and thought "that's a bit complicated" so I rewrote it (My
> rewrite also seems to be bit quicker so I'll upload it as soon as make
> test has finished[*]).  In the course of rewriting it, I saw the line
> my patch touched and went "duh!".

Yeah. The bug must have sneaked in there when the function was
updated to the PySequence_Fast_* implementation.

BTW, could you also add a patch for the test_string.py and
test_unicode.py tests ?

> > I still think that the PySequence_Fast_GETITEM() macro should at
> > least include a fall-back type check which causes some exception in
> > case the used object was not "fixed" using PySequence_Fast() (which
> > I only noticed in string_join() just now).
> 
> It's hard to see how; you're not going to check each invocation of
> PySequence_Fast_GETITEM for a NULL return, are you?  It's possible
> that PySequence_Fast should raise an exception on being passed a
> string or Unicode object... but probably not.

Since not using PySequence_Fast() to initialize the protocol,
I'd suggest doing a Py_FatalError() with some explanatory
text which gets printed to stderr -- still better than a
segfault at some later point due to some dangling pointers...

> > Fredrik's PySequence_Fast_* APIs look interesting, BTW. Should be
> > used more often :-)
> 
> Yes.  But they're pretty new, aren't they? 

Yep. Fredrik added them quite late in the 2.0 release process.

> I find them a bit
> unsatisfactory that it's not possible to hoist the type check out of
> the inner loop.  Still, it seems my PII's branch predictor nails that
> one... (i.e. changing it so that it didn't check inside the loop made
> naff-all difference to the running time).

I think Fredrik speculated on having the compiler optimizers
taking care of the check... hmm, it would probably also help
to somehow declare PyTypeObject slots "const" -- is this possible
in a struct definition ?
 
> Cheers,
> M.
> 
> [*] which reminds me: test_unicodedata is failing for me at the
>     moment.  Anyone else seeing this?  It looks like a python
>     regrtest.py -g is all that's needed...

Is that for the CVS version or the release version ?
 
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/