classification
Title: Doubled backslash in repr() method for unicode
Type:
Stage:
Components: Unicode
Versions: Python 2.4
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: hyeshik.chang
Nosy List: anthonybaxter, cito, hyeshik.chang, nnorwitz
Priority: high
Keywords:

Created on 2006-03-27 02:54 by cito, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name   Uploaded                      Description
uni.diff    nnorwitz, 2006-03-27 06:28    patch for test
Messages (9)
msg27885 - (view) Author: Christoph Zwerschke (cito) * Date: 2006-03-27 02:54
Here is an issue that caused Kid templates (used by
Turbogears) to malfunction in Python 2.4.3c1.

The problem shows up with the following code:

class s1:
    def __repr__(self):
        return '\\n'

class s2:
    def __repr__(self):
        return u'\\n'

print repr(s1()), repr(s2())

I get the following results:

Python 2.3.5: \n \n
Python 2.4.2: \n \n
Python 2.4.3c1: \n \\n 

In the output for Python 2.4.3c1, the backslash in the
representation of class s2 appears doubled. This did not
happen in earlier Python versions and seems to be a bug.

My vague guess is that the issue may have crept in with
an attempted fix of Bug #1379994.

-- Christoph
msg27886 - (view) Author: Anthony Baxter (anthonybaxter) (Python triager) Date: 2006-03-27 05:53
Confirmed - it's also broken in the trunk, and backing out
the patch for http://www.python.org/sf/1379994 (r41728)
fixes the problem. Perky, you checked this in - can you look
at this soon, please? I don't want to release 2.4.3 until
it's fixed, but I also want to get 2.4.3 out this week.

Thanks for the bug report!
msg27887 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-03-27 06:28
Attached a patch for the test case to be added with fix.
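
A minimal sketch of the kind of regression check such a test could
perform. This is not the contents of uni.diff; the class and function
names are hypothetical, and the code is written to run on both old and
current Python versions. It checks that a __repr__ returning a unicode
string yields the same repr() text as one returning a plain string.

```python
class PlainRepr(object):
    # Hypothetical helper: __repr__ returns a plain string literal
    # containing one backslash followed by the letter n.
    def __repr__(self):
        return '\\n'


class UnicodeRepr(object):
    # Hypothetical helper: same text, but written as a unicode literal.
    def __repr__(self):
        return u'\\n'


def check_repr_consistency():
    plain = repr(PlainRepr())
    uni = repr(UnicodeRepr())
    # Both reprs should be identical: a single backslash, then "n".
    assert plain == uni
    assert len(uni) == 2
    return uni


check_repr_consistency()
```

On the affected 2.4.3c1 build, the first assertion would be expected to
fail, because the unicode variant came back with a doubled backslash.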
msg27888 - (view) Author: Hyeshik Chang (hyeshik.chang) * (Python committer) Date: 2006-03-27 06:38
Looking at the C code, unicode_repr is behaving correctly.
The inconsistency comes from PyObject_Repr.
It was introduced by this change, which was intentional:

------------------------------------------------------------------------
r16198 | effbot | 2000-07-09 02:43:32 +0900 (Sun, 09 Jul 2000) | 6 lines

- changed __repr__ to use "unicode escape" encoding for unicode
  strings, instead of the default encoding.
  (see "minidom" thread for discussion, and also patch #100706)
msg27889 - (view) Author: Hyeshik Chang (hyeshik.chang) * (Python committer) Date: 2006-03-27 06:56
Found it:
http://mail.python.org/pipermail/python-dev/2000-July/005353.html
But that intention never actually took effect before 2.4.3.
What problems would there be if we changed PyObject_Repr to use
the default encoding instead of unicode-escape (i.e. revert r16198)?
msg27890 - (view) Author: Anthony Baxter (anthonybaxter) (Python triager) Date: 2006-03-27 06:57
I'm confused how a checkin from 5+ years ago broke a change
from 3 months ago?

Or am I misunderstanding you?
msg27891 - (view) Author: Hyeshik Chang (hyeshik.chang) * (Python committer) Date: 2006-03-27 07:05
Because the unicode-escape codec didn't escape \,
PyObject_Repr(u'\\') passed backslashes through unchanged.
But 5 years ago Martin and Fredrik made PyObject_Repr use the
unicode-escape codec for unicode values returned by __repr__.
So when the unicode-escape codec was fixed 3 months ago, their
intention took effect for the first time.
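
The interaction described above can be illustrated with a short sketch.
It assumes a Python version in which the unicode-escape codec already
escapes backslashes (the post-fix behaviour), and it uses the ASCII
codec as a stand-in for "the default encoding"; the variable names are
illustrative only.

```python
# Two-character text: one backslash followed by the letter n.
text = u'\\n'

# With the fixed unicode-escape codec the backslash is itself escaped,
# so the encoded result is three bytes long: backslash, backslash, n.
escaped = text.encode('unicode_escape')

# A plain codec such as ASCII leaves the backslash untouched, which is
# the two-byte result that older releases effectively produced.
plain = text.encode('ascii')

print(len(escaped), len(plain))
```

Passing a unicode __repr__ result through the first path is what doubles
the backslash; the second path preserves the old output.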
msg27892 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-03-27 07:09
We need to retain the old behaviour, but also fix the bug. 
How can we do that?  
msg27893 - (view) Author: Anthony Baxter (anthonybaxter) (Python triager) Date: 2006-03-28 07:39
Ok. After talking to perky, I reverted the fix for 1379994
on the release24-maint branch, and reverted /F's ancient
change on the trunk. This seemed the best combination of
practicality and purity. Fix will be in 2.4.3 final. 

Thanks for the bug report. Man, unicode and repr is a twisty
ball of horrors.
History
Date                 User   Action  Args
2022-04-11 14:56:16  admin  set     github: 43094
2006-03-27 02:54:37  cito   create