[Python-Dev] Re: [Bug #121013] Bug in <stringobject>.join(<unicodestring>)
M.-A. Lemburg
mal@lemburg.com
Tue, 28 Nov 2000 10:17:29 +0100
Michael Hudson wrote:
>
> "M.-A. Lemburg" <mal@lemburg.com> writes:
>
> > Michael Hudson wrote:
> > >
> > > "M.-A. Lemburg" <mal@lemburg.com> writes:
> > >
> > > > > Date: 2000-Nov-27 10:12
> > > > > By: mwh
> > > > >
> > > > > Comment:
> > > > > I hope you're all suitably embarrassed - please see patch #102548 for the trivial fix...
> > > >
> > > > Hehe, that was indeed a trivial patch. What was that about trees
> > > > in a forest...
> > >
> > > The way I found it was perhaps instructive. I was looking at the
> > > function, and thought "that's a bit complicated" so I rewrote it (My
> > > rewrite also seems to be a bit quicker so I'll upload it as soon as make
> > > test has finished[*]). In the course of rewriting it, I saw the line
> > > my patch touched and went "duh!".
> >
> > Yeah. The bug must have sneaked in there when the function was
> > updated to the PySequence_Fast_* implementation.
> >
> > BTW, could you also add a patch for the test_string.py and
> > test_unicode.py tests ?
>
> Here's an effort, but it's a memory scribbling bug. If I change
> test_unicode.py thus:
>
> Index: test_unicode.py
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Lib/test/test_unicode.py,v
> retrieving revision 1.22
> diff -c -r1.22 test_unicode.py
> *** test_unicode.py 2000/10/23 17:22:08 1.22
> --- test_unicode.py 2000/11/28 08:49:11
> ***************
> *** 70,76 ****
>
>   # join now works with any sequence type
>   class Sequence:
> !     def __init__(self): self.seq = 'wxyz'
>       def __len__(self): return len(self.seq)
>       def __getitem__(self, i): return self.seq[i]
>
> --- 70,76 ----
>
>   # join now works with any sequence type
>   class Sequence:
> !     def __init__(self): self.seq = [u'w',u'x',u'y',u'z']
>       def __len__(self): return len(self.seq)
>       def __getitem__(self, i): return self.seq[i]
>
> ***************
> *** 78,83 ****
> --- 78,87 ----
>   test('join', u'', u'abcd', (u'a', u'b', u'c', u'd'))
>   test('join', u' ', u'w x y z', Sequence())
>   test('join', u' ', TypeError, 7)
> + test('join', ' ', u'a b c d', [u'a', u'b', u'c', u'd'])
> + test('join', '', u'abcd', (u'a', u'b', u'c', u'd'))
> + test('join', ' ', u'w x y z', Sequence())
> + test('join', ' ', TypeError, 7)
>
>   class BadSeq(Sequence):
>       def __init__(self): self.seq = [7, u'hello', 123L]
>
> and back out the fix for the join bug, this happens:
>
> ...
> ...
> Testing Unicode formatting strings... done.
> Testing builtin codecs...
> Traceback (most recent call last):
>   File "test_unicode.py", line 378, in ?
>     assert unicode('hello','utf8') == u'hello'
>   File "/usr/local/src/python/dist/build/Lib/encodings/__init__.py", line 30, in ?
>     import codecs,aliases
> SystemError: compile.c:185: bad argument to internal function
> Segmentation fault
>
> i.e. it crashes miles away from the problem.
The test is only supposed to ensure that we don't trip over this
again. It's not intended to work in any way *before* your patch is
applied.

I always try to integrate tests for bugs into the test suites
for my mx stuff and AFAICT this also seems to be the Python
dev style.
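
To illustrate the kind of regression test meant here, the following is a
minimal sketch only. It assumes a present-day Python 3 interpreter, where
str and unicode are a single type, so it cannot reproduce the original
crash; it only pins down the expected join() behaviour. The helper
check_join is hypothetical and stands in for the test() function used in
test_unicode.py.

```python
# Sketch of a join() regression test; not the actual test_unicode.py code.
# Assumption: Python 3, where every string literal is already Unicode.

class Sequence:
    """Bare-bones sequence exposing only __len__ and __getitem__."""
    def __init__(self):
        self.seq = ['w', 'x', 'y', 'z']
    def __len__(self):
        return len(self.seq)
    def __getitem__(self, i):
        return self.seq[i]


def check_join(sep, expected, seq):
    """Hypothetical helper: True if sep.join(seq) matches `expected`.

    `expected` is either the joined string or an exception class that
    the call is supposed to raise.
    """
    if isinstance(expected, type) and issubclass(expected, Exception):
        try:
            sep.join(seq)
        except expected:
            return True
        return False
    return sep.join(seq) == expected


# The four cases mirror the lines added by the diff quoted above.
results = [
    check_join(' ', 'a b c d', ['a', 'b', 'c', 'd']),
    check_join('', 'abcd', ('a', 'b', 'c', 'd')),
    check_join(' ', 'w x y z', Sequence()),
    check_join(' ', TypeError, 7),
]
```

Every entry in results should come out true once the fix is in; the point
of keeping such cases in the suite is only to catch the bug if it returns.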
--
Marc-Andre Lemburg
______________________________________________________________________
Company: http://www.egenix.com/
Consulting: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/