[Python-Dev] Python 2.0 beta 2 pre-release

M.-A. Lemburg mal@lemburg.com
Wed, 27 Sep 2000 14:20:30 +0200


"M.-A. Lemburg" wrote:
> 
> Fredrik Lundh wrote:
> >
> > tim wrote:
> > > > test test_unicodedata failed -- Writing:
> > > > 'e052289ecef97fc89c794cf663cb74a64631d34e', expected:
> > > > 'b88684df19fca8c3d0ab31f040dd8de89f7836fe'
> > >
> > > The problem appears to be that the test uses the secret "unicode-internal"
> > > encoding, which is dependent upon the big/little-endianness of your
> > > platform.
> >
> > my fault -- when I saw that, I asked myself "why the heck doesn't mal
> > just use repr, like I did?" and decided that he used "unicode-escape"
> > to make sure the test didn't break if the repr encoding changed.
> >
> > too bad my brain didn't trust my eyes...
> 
> repr() would have been a bad choice since the past has shown
> that repr() does change. I completely forgot about the endianness
> which affects the hash value.
> 
> > > I can reproduce your flawed hash exactly on my platform by replacing this
> > > line:
> > >
> > >         h.update(u''.join(data).encode('unicode-internal'))
> >
> > I suggest replacing "unicode-internal" with "utf-8" (which is as canonical
> > as anything can be...)
> 
> I think UTF-8 will bring about problems with surrogates (that's
> why I used the unicode-internal codec). I haven't checked this
> though... I'll fix this ASAP.

UTF-8 works for me. I'll check in a patch.
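
A rough sketch of the byte-order issue discussed above, not the actual
patch. It assumes a modern Python with hashlib, and it uses the explicit
"utf-16-le" and "utf-16-be" codecs as stand-ins for what a byte-order
dependent internal encoding would yield on little- and big-endian
machines. The sample data and the digest() helper are made up for
illustration:

```python
import hashlib

# Hypothetical sample data; the real test hashes per-character database info.
data = [u"a", u"\u00e9", u"\u20ac", u"\U00010000"]
text = u"".join(data)

def digest(encoding):
    """Return the SHA-1 hex digest of the sample text in the given encoding."""
    return hashlib.sha1(text.encode(encoding)).hexdigest()

# The same text gives different bytes, and so different hashes, depending
# on byte order. This mirrors the platform-dependent test failure.
little = digest("utf-16-le")
big = digest("utf-16-be")

# UTF-8 has a single byte sequence per code point on every platform,
# so the digest is reproducible everywhere.
utf8_bytes = text.encode("utf-8")
utf8 = digest("utf-8")

print(little != big)
print(utf8)
```

Hashing the UTF-8 bytes ties the expected value to the characters
themselves rather than to the machine's memory layout.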

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/