classification
Title: Improve %s support for unicode
Type:
Stage:
Components: Interpreter Core
Versions:
process
Status: closed
Resolution:
Dependencies:
Superseder:
Assigned To: effbot
Nosy List: effbot, lemburg, nascheme
Priority: normal
Keywords: patch

Created on 2005-03-09 01:43 by nascheme, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name: unicode_format.txt (uploaded by nascheme, 2005-03-09 01:43)
File name: pyobject_text.txt (uploaded by nascheme, 2005-03-10 21:13; version 2 of patch)
Messages (8)
msg47901 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2005-03-09 01:43
"'%s' % unicode_string" produces a unicode result.  I
think the following code should also return a unicode
string:

class Wrapper:
    def __str__(self):
        return unicode_string
'%s' % Wrapper()

That behavior would make it easier to write library
code that can work with either str objects or unicode
objects.
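
For readers trying this on a current interpreter, here is a minimal sketch of the requested behavior. On Python 3, where str is the Unicode string type, the equivalent example returns the __str__ result unchanged. The constructor argument and the sample value are illustrative additions, not part of the original report.

```python
class Wrapper(object):
    """Wraps a text value; __str__ returns it unchanged."""

    def __init__(self, value):
        self.value = value

    def __str__(self):
        return self.value


# Sample non-ASCII text (an assumption made for this illustration).
unicode_string = u"caf\xe9"

# The formatted result carries the text through without any ASCII encoding step.
result = "%s" % Wrapper(unicode_string)
```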

The fix is pretty simple (see the attached patch).
Perhaps the PyObject_Text function should be called
_PyObject_Text instead.  Alternatively, if the function
is made public then we should document it and perhaps
also provide a builtin function called 'text' that uses it.


msg47902 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2005-03-09 10:10
Nice patch. 

Only nit: PyObject_Text() should check that the result of
tp_str() is indeed either a string or unicode instance
(possibly from a subclass). Otherwise, the function wouldn't
be able to guarantee this feature - which is what it's all
about.
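
The check described above can be modeled in pure Python. This is only a sketch under stated assumptions: the actual patch is C code that is not reproduced here, the function name text is borrowed from the proposed builtin, and TEXT_TYPES is a hypothetical helper. On Python 3 the pair (str, bytes) stands in for Python 2's (str, unicode) so that the sketch runs today.

```python
# Hypothetical pure-Python model of the type check; not the patch itself.
try:
    TEXT_TYPES = (str, unicode)  # Python 2: 8-bit str and unicode
except NameError:
    TEXT_TYPES = (str, bytes)    # Python 3 stand-ins (assumption)


def text(obj):
    """Return obj's __str__ result, insisting that it is a string type."""
    if isinstance(obj, TEXT_TYPES):
        # Already a string (or an instance of a string subclass).
        return obj
    # Call the type's __str__ directly, bypassing str()'s own checking.
    result = type(obj).__str__(obj)
    if not isinstance(result, TEXT_TYPES):
        raise TypeError("__str__ returned non-string (type %s)"
                        % type(result).__name__)
    return result
```

Because isinstance accepts subclasses, a __str__ that returns an instance of a string subclass passes the check, which matches the "possibly from a subclass" remark.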
msg47903 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2005-03-10 21:12
Attaching a better patch.  Add a builtin function called
"text".  Change PyObject_Text to check the return types as
suggested by Marc.  Update the documentation and the tests.
msg47904 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2005-03-10 21:13
attempt to attach patch again
msg47905 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2005-04-20 21:00
Assigning to effbot for review.  He had mentioned something
about __text__ at one point.
msg47906 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2005-04-20 21:27
Looks OK to me; not sure what you mean with __text__ -
__str__ already has taken that role long ago.
msg47907 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2005-04-20 21:46
Here's a quote from him:
> I'm beginning to think that we need an extra method (__text__), that
> can return any kind of string that's compatible with Python's text model.
>
> (in today's CPython, that's an 8-bit string with ASCII only, or a Unicode
> string.  future Python's may support more string types, at least at
> the C implementation level).
>
> I'm not sure we can change __str__ or __unicode__ without breaking
> code in really obscure ways (but I'd be happy to be proven wrong).

My idea is that we can change __str__ without breaking code.
 The reason is that no one should be calling tp_str
directly.  Instead they use PyObject_Str.
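
A rough Python-level analogy of this argument, offered as an illustration rather than a description of the C code: calling __str__ directly applies no checking, while the public entry point, str(), validates the result. Callers that go through the public entry point therefore see whatever policy it enforces. The class name Odd is hypothetical.

```python
class Odd(object):
    def __str__(self):
        return 42  # deliberately the wrong return type


obj = Odd()

# Direct method call: nothing validates the return value.
direct = obj.__str__()

# Public entry point: str() rejects a non-string result with TypeError.
try:
    str(obj)
    str_accepted = True
except TypeError:
    str_accepted = False
```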

I don't know what he meant by "string that's compatible with
Python's text model".  With my change, Python can only deal
with str or unicode instances.  I have no idea how we could
support other string implementations.

I don't want to introduce a text() builtin that calls
__str__ and then later realize that __text__ would be
useful.  Perhaps this change is big enough to require a PEP.
msg47908 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2005-08-22 20:57
Closing in favor of patch 1266570.
History
Date User Action Args
2022-04-11 14:56:10 admin set github: 41671
2005-03-09 01:43:16 nascheme create