[Python-Dev] Revised PEP 349: Allow str() to return unicode strings

Wolfgang Lipp paragate at gmx.net
Tue Aug 23 08:59:28 EDT 2005



just tested the proposed implementation on a unicode-naive module,
basically using:

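import sys
import __builtin__
# reload() restores sys.setdefaultencoding, which site.py deletes at startup
reload( sys ); sys.setdefaultencoding( 'utf-8' )
# swap the built-in str() for the patched version
__builtin__.__dict__[ 'str' ] = new_str_function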

et voilà, str() calls in the module are rewritten, and
print u'düsseldorf' does work as expected(*) (even on
systems where i have no access to sitecustomize, like
at my python-friendly isp's servers).

---
* my expectation is that unicode strings do print out
   as utf-8, as i can't see any better solution.
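
for reference: new_str_function itself is not shown in this message. a
minimal sketch of what such a function might look like, roughly following
the behaviour the pep describes (strings pass through unchanged, anything
else goes through __str__), is given here. the name string_types and the
python 3 fallback are assumptions added for illustration only and are not
taken from the post or the patch.

```python
# hypothetical stand-in for new_str_function; not the code used in the post
try:
    string_types = (str, unicode)   # python 2: byte strings and unicode strings
except NameError:
    string_types = (str, bytes)     # assumption: fallback so the sketch also runs on python 3


def new_str_function(obj):
    """Return obj as a string without forcing unicode down to bytes."""
    # exact string types pass through untouched, so unicode stays unicode
    if type(obj) in string_types:
        return obj
    # everything else is converted through its own __str__ method
    result = obj.__str__()
    if not isinstance(result, string_types):
        raise TypeError('__str__ returned non-string')
    return result
```

with a function like this installed, str( u'düsseldorf' ) hands back the
unicode object instead of trying to encode it with the default codec,
which is where the "ordinal not in range" errors usually come from.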

i suggest making this option available, e.g. via a module in
the standard lib, to ease the transition for people in case the pep
doesn't make it. it may be applied where deemed necessary and
ignored otherwise.

if nobody thinks the reload hack is too awful and this solution
stands testing, i guess i'll post it to the aspn cookbook. after
all these countless hours of hunting down ordinal not in range,
finally i'm starting to see some light in the issue.

_wolf



On Tue, 23 Aug 2005 12:39:03 +0200, M.-A. Lemburg <mal at egenix.com> wrote:

> Thomas Heller wrote:
>> Neil Schemenauer <nas at arctrix.com> writes:
>>
>>
>>> [Please mail followups to python-dev at python.org.]
>>>
>>> The PEP has been rewritten based on a suggestion by Guido to change
>>> str() rather than adding a new built-in function.  Based on my
>>> testing, I believe the idea is feasible.  It would be helpful if
>>> people could test the patched Python with their own applications and
>>> report any incompatibilities.
>>>
>>
>>
>> I like the fact that currently unicode(x) is guaranteed to return a
>> unicode instance, or raises a UnicodeDecodeError.  Same for str(x),
>> which is guaranteed to return a (byte) string instance or raise an
>> error.
>>
>> Wouldn't a new function also make the intent clearer?
>>
>> So I think I'm +1 on the text() built-in, and -0 on changing str.
>
> Same here.
>
> A new API would also help make the transition easier from the
> current mixed data/text type (strings) to data-only (bytes)
> and text-only (text, renamed from unicode) in Py3.0.
>





