"convert" string to bytes without changing data (encoding)

Wed Mar 28 20:11:56 EDT 2012

On 3/28/2012 14:43, Ross Ridge wrote:
> Evan Driscoll  <driscoll at cs.wisc.edu> wrote:
>> So yes, you can say that pretending there's not a mapping of strings to 
>> internal representation is silly, because there is. However, there's 
>> nothing you can say about that mapping.
> 
> I'm not the one labeling anything as being silly.  I'm the one labeling
> the things as bullshit, and that's what you're doing here.  I can in
> fact say what the internal byte string representation of strings is any
> given build of Python 3.  Just because I can't say what it would be in
> an imaginary hypothetical implementation doesn't mean I can never say
> anything about it.

People like you -- who write code that relies on assumptions which are
not even remotely guaranteed by the spec -- are part of the reason
software sucks.

People like you hold back progress, because system implementers aren't
free to make changes without breaking backwards compatibility. Enormous
amounts of effort are expended to test programs and diagnose problems
which are caused by unwarranted assumptions like "the encoding of a
string is UTF-8". In the worst case, assumptions like that lead to
security fixes that don't go as far as they could, as in the recent
discussion about hashing.

Python is definitely closer to the "willing to break backwards
compatibility to improve" end of the spectrum than some other projects
(*cough* Windows *cough*), but that still doesn't mean that you can make
assumptions like that.
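
To make that concrete, here is a minimal sketch of the portable
alternative, assuming the goal is simply to get bytes out of a str. The
helper name to_bytes and the sample text are made up for illustration;
str.encode and bytes.decode are the standard methods. The bytes you get
are determined by the codec you name, not by whatever the interpreter
happens to store internally.

```python
def to_bytes(text, encoding="utf-8"):
    """Hypothetical helper: turn a str into bytes using an explicitly named codec."""
    return text.encode(encoding)


# Sample string with non-ASCII characters (illustrative only).
sample = "na\u00efve \u2603"

# The same string yields different bytes under different codecs, so
# "the bytes of a string" means nothing until an encoding is named.
utf8_bytes = to_bytes(sample, "utf-8")
utf16_bytes = to_bytes(sample, "utf-16-le")
print(len(sample), len(utf8_bytes), len(utf16_bytes))

# Decoding with the same codec recovers the original string on any
# conforming implementation, whatever its internal representation is.
assert utf8_bytes.decode("utf-8") == sample
assert utf16_bytes.decode("utf-16-le") == sample
```

Nothing here depends on how a particular build lays out strings in
memory, so it keeps working if that layout changes.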

This email is a bit harsher than it deserves -- but I feel not by much.

Evan