"convert" string to bytes without changing data (encoding)

MRAB python at mrabarnett.plus.com
Wed Mar 28 15:50:01 EDT 2012


On 28/03/2012 20:02, Prasad, Ramit wrote:
>>  >The right way to convert bytes to strings, and vice versa, is via
>>  >encoding and decoding operations.
>>
>>  If you want to dictate to the original poster the correct way to do
>>  things then you don't need to do anything more than that.  You don't
>>  need to pretend, like Chris Angelico, that there isn't a direct mapping
>>  from his Python 3 implementation's internal representation of strings
>>  to bytes in order to label what he's asking for as being "silly".
>
> It might be technically possible to recreate the internal implementation,
> or get the byte data. That does not mean it will make any sense or
> be understood in a meaningful manner. I think Ian summarized it
> very well:
>
>>You can't generally just "deal with the ascii portions" without
>>knowing something about the encoding.  Say you encounter a byte
>>greater than 127.  Is it a single non-ASCII character, or is it the
>>leading byte of a multi-byte character?  If the next character is less
>>than 127, is it an ASCII character, or a continuation of the previous
>>character?  For UTF-8 you could safely assume ASCII, but without
>>knowing the encoding, there is no way to be sure.  If you just assume
>>it's ASCII and manipulate it as such, you could be messing up
>>non-ASCII characters.
>
> Technically, ASCII goes up to 256 but they are not A-z letters.
>
Technically, ASCII is 7-bit, so it goes up to 127.
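
A quick Python 3 sketch of both points, in case it helps. The sample
bytes and the is_ascii() helper are made up for illustration and are
not from earlier in the thread: the same two bytes decode to different
text depending on the encoding you assume, and the 'ascii' codec
rejects anything above 127.

```python
# Illustrative data only: U+00E9 (e with acute accent) encoded as UTF-8.
raw = "\u00e9".encode("utf-8")       # two bytes: 0xC3 0xA9

# The same bytes give different text under different assumed encodings.
as_utf8 = raw.decode("utf-8")        # one character, U+00E9
as_latin1 = raw.decode("latin-1")    # two characters, U+00C3 and U+00A9


def is_ascii(data: bytes) -> bool:
    """Hypothetical helper: True if every byte fits in 7 bits (0..127)."""
    return all(b < 128 for b in data)


# The 'ascii' codec accepts 127 but refuses 128.
try:
    bytes([128]).decode("ascii")
    decoded_128 = True
except UnicodeDecodeError:
    decoded_128 = False
```

Byte values 128 to 255 only acquire a meaning once you pick an encoding
such as Latin-1 or UTF-8, which is why the bytes alone are not enough.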


