[Patches] [ python-Patches-403685 ] Printing unicode

Wed, 20 Feb 2002 00:57:40 -0800

Patches item #403685, was opened at 2001-02-08 06:37
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=403685&group_id=5470

Category: Core (C code)
Group: None
Status: Closed
>Resolution: Out of Date
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: M.-A. Lemburg (lemburg)
Summary: Printing unicode

Initial Comment:
The print statement *always* passes the to-be-printed objects through str() before passing them on to file.write().

This is a problem for unicode. It is not possible to print unicode strings to a unicode-aware file.

This patch allows a stream to inhibit this automatic call to str() by defining a false __str_before_write__ attribute. Such an attribute has been added to the streams in codecs.py.
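
A minimal sketch of the intended behaviour, written in present-day Python rather than taken from the patch itself. The attribute name follows the spelling used in the C snippet later in this thread; write_object and RawRecorder are hypothetical names used only for illustration, and the real change lives in PyFile_WriteObject.

```python
import io


def write_object(obj, stream):
    """Emulate the patched print path: call str() unless the stream opts out."""
    # A missing attribute means "keep the old behaviour", i.e. stringize.
    stringize = getattr(stream, "__str_before_write__", True)
    value = str(obj) if stringize else obj
    stream.write(value)


class RawRecorder:
    """Hypothetical stream that opts out and records exactly what it is given."""

    __str_before_write__ = False

    def __init__(self):
        self.written = []

    def write(self, value):
        self.written.append(value)


plain = io.StringIO()
write_object(42, plain)   # no flag: the object goes through str() first

raw = RawRecorder()
write_object(42, raw)     # false flag: the object arrives untouched
```

Streams that do not define the flag see no change, which is the backward-compatibility argument for making the new behaviour opt-in.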

I hope the documentation patch is correct; I'm not able to test it.

(added by Toby Dickenson, sourceforge id 'htrd', who seems to have a broken https)

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-02-20 00:57

Message:
Logged In: YES 
user_id=21627

Superseded by patch #462849.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-02-10 06:23

Message:
Postponed for discussion in Python 2.2 cycle as per request by Guido.

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-02-09 05:45

Message:
I'd rather break some code here and then get this done right once
and for all.  I wouldn't want to carry along a special attribute
which needs to be checked before every .write() operation. This
costs performance and adds unnecessary complexity to the file
API. Since Unicode is still very new, I doubt that the impact of
this will cause people too much trouble.

IMHO, the correct way to deal with this is to let the
file object's write methods deal with the problem in an
application-specific way.
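
As a rough illustration of that idea (a sketch in present-day Python, not code from any patch): the caller hands the object to write() unchanged and the stream decides how to convert it. EncodingWriter is a hypothetical name, and the choice of UTF-8 is an assumption made for the example.

```python
import io


class EncodingWriter:
    """Hypothetical file-like wrapper that converts what it receives on write()."""

    def __init__(self, raw, encoding="utf-8"):
        self.raw = raw            # underlying byte stream
        self.encoding = encoding

    def write(self, value):
        # The stream, not the caller, decides how to turn the object into bytes.
        if isinstance(value, bytes):
            data = value
        else:
            data = str(value).encode(self.encoding)
        self.raw.write(data)


buffer = io.BytesIO()
out = EncodingWriter(buffer)
out.write("gr\u00fc\u00df")   # text is encoded by the stream itself
out.write(b" ok")             # bytes pass through unchanged
```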

cStringIO.c (and all other file-like objects) should be fixed to
use the s# parser markers instead of requiring a real string
object. This will also enhance interoperability with other data storage types.

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-02-09 04:54

Message:
waaah - without https I can't upload a new version of the patch.

The original patch fails to clear the exception if getattr('__str_before_write__') fails... that inner part of PyFile_WriteObject needs to be:

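		/* A stream opts out of str() by defining a false __str_before_write__. */
		stringize = PyObject_GetAttrString(f, "__str_before_write__");
		if (stringize == NULL)
			PyErr_Clear();	/* attribute missing: keep the old behaviour */
		if (stringize == NULL || PyObject_IsTrue(stringize))
			value = PyObject_Str(v);
		else {
			Py_INCREF(v);
			value = v;
		}
		Py_XDECREF(stringize);	/* GetAttrString returned a new reference */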

My apologies for the extra trouble.

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2001-02-09 04:02

Message:
I hadn't seen that original discussion, and can't find it in the archives now :-(

However, your summary matches exactly what this patch achieves for files that have __str_before_write__=0

I think we have to require this extra flag to enable the new behaviour. For example, this prevents breakage of old code that prints to a StringIO instance.

(Toby Dickenson, tdickenson@geminidataloggers.com)

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-02-08 08:02

Message:
I don't remember the details, but there was a discussion about this
problem on python-dev. The outcome was to let Unicode objects
pass through as-is to the file object and then have it apply
whatever conversion it takes.

----------------------------------------------------------------------
