Python3.3 str() bug?

Stefan Behnel stefan_ml at behnel.de
Fri Nov 9 12:07:51 EST 2012


Helmut Jarausch, 09.11.2012 14:13:
> On Fri, 09 Nov 2012 23:22:04 +1100, Chris Angelico wrote:
>> What you really should be doing is not transforming the whole
>> structure, but explicitly transforming each part inside it. I
>> recommend you stop fighting the language and start thinking about your
>> data as either *bytes* or *characters* and using the appropriate data
>> types (bytes or str) everywhere. You'll then find that it makes
>> perfect sense to explicitly translate (en/decode) from one to another,
>> but it doesn't make sense to encode a list in UTF-8 or decode a
>> dictionary from Latin-1.
>>
>>> This problem has arisen while converting a working Python2 script to Python3.3.
>>> Since Python2 doesn't have bytestrings it just works.
>>
>> Actually it does; it just calls them "str". And there's a Unicode
>> string type, called "unicode", which is (more or less) the thing that
>> Python 3 calls "str".
>>
>> You may be able to do some kind of recursive cast that, in one sweep
>> of your data structure, encodes all str objects into bytes using a
>> given encoding (or the reverse thereof). But I don't think this is the
>> best way to do things.
> 
> Thanks, but in my case the (complex) object is returned via ctypes from the 
> aspell library.
> I still think that a standard function in Python3 which is able to 'stringify'
> objects should take an encoding parameter.

And how would that work? Would it recursively run through all data
structures you pass in or stop at some level or at some type of object?
Would it simply concatenate the substrings (and with what separator?), or
does the chaining depend on the objects found? Should it use the same
separator for everything or different separators for each level of the data
structure? Should it use str() for everything or repr() for some? Is str()
the right thing or are there special objects that need more than just a
call to str(), some kind of further preprocessing?
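
To make the str()/repr() question concrete, here is a small sketch of what
Python 3 does today. The sample data is made up for illustration and does
not come from aspell.

```python
# Hypothetical sample data: one UTF-8 encoded bytes object and one str.
words = [b"caf\xc3\xa9", "plain"]

# str() on a container calls repr() on each element,
# so the bytes object keeps its b'' prefix and escapes.
as_text = str(words)
print(as_text)

# str() does accept an encoding, but only for a single bytes-like object.
decoded = str(words[0], "utf-8")
print(decoded)

# Passing a container together with an encoding raises TypeError.
try:
    str(words, "utf-8")
    container_ok = True
except TypeError:
    container_ok = False
print(container_ok)
```

So the encoding parameter already exists where the meaning is unambiguous,
and it is rejected where the questions above would have to be answered.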

There are so many ways to do something like this, and it's so
straightforward to do in a given use case, that it's IMHO useless to even
think about adding a "general solution" for this to the stdlib.
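
As a rough sketch of how little code one specific use case needs, assuming
the data consists only of bytes, lists, tuples and dicts: the helper name
decode_nested and the sample values are hypothetical, and a different use
case would make different choices at every step.

```python
def decode_nested(obj, encoding="utf-8"):
    """Return a copy of obj with every bytes object decoded to str.

    Only bytes, list, tuple and dict are handled; anything else is
    returned unchanged. That choice is an assumption of this sketch.
    """
    if isinstance(obj, bytes):
        return obj.decode(encoding)
    if isinstance(obj, list):
        return [decode_nested(item, encoding) for item in obj]
    if isinstance(obj, tuple):
        return tuple(decode_nested(item, encoding) for item in obj)
    if isinstance(obj, dict):
        return {
            decode_nested(key, encoding): decode_nested(value, encoding)
            for key, value in obj.items()
        }
    return obj


# Hypothetical data shaped like a list of spelling suggestions.
suggestions = [b"caf\xc3\xa9", (b"na\xc3\xafve", 3), {b"word": b"r\xc3\xa9sum\xc3\xa9"}]
print(decode_nested(suggestions))
```

Note that this returns a decoded structure rather than one big string; how
to join the parts is still left to the caller.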

Stefan




