[Python-Dev] bytes / unicode

Terry Reedy tjreedy at udel.edu
Mon Jun 21 19:45:20 CEST 2010


On 6/21/2010 8:51 AM, Nick Coghlan wrote:

>
> I don't know that the "all is well" camp actually exists. The camp
> that I do see existing is the one that says "without a bug report,
> inconsistencies in the standard library's unicode handling won't get
> fixed".
>
> The issues picked up by the regression test suite have already been
> dealt with, but that suite is unfortunately far from comprehensive.
> Just like a lot of Python code that is out there, the standard library
> isn't immune to the poor coding practices that were permitted by the
> blurry lines between text and octet streams in 2.x.
>
> It may be that there are places where we need to rewrite standard
> library algorithms to be bytes/str neutral (e.g. by using length one
> slices instead of indexing). It may be that there are more APIs that
> need to grow "encoding" keyword arguments that they then pass on to
> the functions they call or use to convert str arguments to bytes (or
> vice-versa). But without people trying to port affected libraries and
> reporting bugs when they find issues, the situation isn't going to
> improve.
>
> Now, if these bugs are already being reported against 3.1 and just
> aren't getting fixed, that's a completely different story...

Some such bugs were reported over a year ago. See, for instance,
http://bugs.python.org/issue5468
I am getting the impression that the people who use the web modules
tend, like me, not to have the tools to write and test patches. So they
can squeak but not grease.
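
For anyone unsure what the "length one slices instead of indexing" and
"encoding keyword argument" suggestions quoted above look like in practice,
here is a minimal sketch for Python 3. The function names first_unit and
to_bytes and the "utf-8" default are my own assumptions for illustration;
they are not taken from any stdlib module or from the issue linked above.

```python
def first_unit(data):
    """Return the first element of data, keeping the input's type.

    Indexing differs between the two types in Python 3:
    b"abc"[0] is the int 97, while "abc"[0] is the str "a".
    A length-one slice behaves the same way for both.
    """
    return data[:1]


def to_bytes(value, encoding="utf-8"):
    """Hypothetical helper: accept str or bytes, always return bytes.

    The 'encoding' keyword is passed through to str.encode, which is the
    pattern described in the quoted text. The default is an assumption.
    """
    if isinstance(value, str):
        return value.encode(encoding)
    return value


# Same code path works for both text and octets.
print(first_unit(b"abc"), first_unit("abc"))
print(to_bytes("abc"), to_bytes(b"abc"))
```

The slice form lets one algorithm serve both types without an isinstance
check, whereas the helper makes the conversion point and its encoding
explicit instead of relying on an implicit default.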

Terry Jan Reedy


