[IronPython] Django, __unicode__, and #20366
Dino Viehland
dinov at microsoft.com
Thu Feb 11 20:11:33 CET 2010
Oh, I should also add I think Michael's proposed fix makes this code work
correctly.
> -----Original Message-----
> From: users-bounces at lists.ironpython.com [mailto:users-
> bounces at lists.ironpython.com] On Behalf Of Dino Viehland
> Sent: Thursday, February 11, 2010 11:07 AM
> To: Discussion of IronPython; Michael Foord
> Subject: Re: [IronPython] Django, __unicode__, and #20366
>
> Vernon wrote:
> > You need the 'byte' class for Python 3 anyway. Implement it now.
>
> Done! Assuming you mean bytes, it's in 2.6 already. Now if everyone
> would upgrade their code to use b'' :)
>
> > A small sample...
> >
> > <code x.py>
> > import sys
> > u = u'1234\u00f6'
> > s = '1234'
> > x = str(s)
> > print type(x), repr(x)
> > x = unicode(s)
> > print type(x), repr(x)
> > try:
> >     x = unicode(u)
> >     print type(x), repr(x)
> > except:
> >     print 'Error=',sys.exc_info()[0]
> > try:
> >     x = str(u)
> >     print type(x), repr(x)
> > except:
> >     print 'Error=',sys.exc_info()[0]
> > </code>
> > --------------------
> >
> > The results...
> >
> > >c:\python26\python.exe x.py
> > <type 'str'> '1234'
> > <type 'unicode'> u'1234'
> > <type 'unicode'> u'1234\xf6'
> > Error= <type 'exceptions.UnicodeEncodeError'>
> >
> > >"c:\program files\Ironpython 2.6\ipy.exe" x.py
> > <type 'str'> '1234'
> > <type 'str'> '1234'
> > Error= <type 'exceptions.UnicodeDecodeError'>
> > Error= <type 'exceptions.UnicodeDecodeError'>
> >
> > >copy x.py x3.py
> > >2to3 -w x3.py
> > >c:\python31\python.exe x3.py
> > <class 'str'> '1234'
> > <class 'str'> '1234'
> > <class 'str'> '1234ö'
> > <class 'str'> '1234ö'
> > ------------------------------
> > One would think that IronPython should produce the same output as
> > Python 3 -- since 'str' and 'unicode' are the same thing in both
> > dialects. In particular, the exception when 'converting' unicode to
> > unicode is just plain wrong.
>
>
> I'm not going to argue the exception isn't wrong. But saying
> IronPython should output the same thing as an entirely different script
> isn't right either. After running 2to3 the script looks like this for
> me:
>
> import sys
> u = '1234\u00f6'
> s = '1234'
> x = str(s)
> print(type(x), repr(x))
> x = str(s)
> print(type(x), repr(x))
> try:
>     x = str(u)
>     print(type(x), repr(x))
> except:
>     print('Error=',sys.exc_info()[0])
> try:
>     x = str(u)
>     print(type(x), repr(x))
> except:
>     print('Error=',sys.exc_info()[0])
>
> You can argue whether or not 2to3 did the right thing here - it has
> completely dropped the distinction between str and unicode. In reality
> if this was a script written for Python 2.5 and above your usage of str
> here is ambiguous. If this script was written for 2.6 and above then
> it's clear you want strings and not bytes because you'd have used
> bytes/bytearray/b'' to indicate bytes. The problem is there's still
> lots of code which runs on 2.5+ and won't be using bytes/bytearray/b''
> but really is dealing with bytes and not strings.
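
For illustration only (not part of the original message): a minimal Python 3 sketch of the str/bytes distinction described above. The variable names are hypothetical, and UTF-8 is an assumed encoding chosen for the example. Once text and bytes are separate types, the intent no longer has to be guessed.

```python
# Python 3 sketch: text and bytes are distinct types, so intent is explicit.
text = '1234\u00f6'          # str: a sequence of Unicode code points
data = text.encode('utf-8')  # bytes: one explicit encoding of that text

print(type(text), repr(text))
print(type(data), repr(data))

# Converting str to str changes nothing, as in the x3.py run quoted earlier.
same = str(text)

# Encoding non-ASCII text as ASCII raises UnicodeEncodeError, the same
# exception CPython 2.6 reports for str(u) in the sample script.
error_type = None
try:
    text.encode('ascii')
except UnicodeEncodeError as exc:
    error_type = type(exc)
    print('Error=', error_type)

# Decoding with the matching codec restores the original text.
round_trip = data.decode('utf-8')
```

Code that still targets 2.5 has no b'' literal and no bytes name, which is why a plain str there does not say which of the two meanings is intended.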
>
> The fact is this is going to be broken unless we were to make str be a
> distinct type from Unicode - then there'd be no ambiguity and we
> wouldn't have to guess. But that's a massive change which propagates
> through the entire IronPython code base and involves tons of breaking
> changes. I've looked at doing this before and it spreads everywhere
> and there's lots of new ugliness. We could look at doing it again but
> it seems like making that massive change and then switching to 3k and
> changing it all back isn't very productive.
>
>
>