"convert" string to bytes without changing data (encoding)

Tim Chase python.list at tim.thechases.com
Wed Mar 28 14:49:19 EDT 2012


On 03/28/12 13:05, Ross Ridge wrote:
> Ross Ridge<rridge at csclub.uwaterloo.ca>  wrote:
>> But a Python Unicode string might be stored in several
>> ways; for all you know, it might actually be stored as a sequence of
>> apples in a refrigerator, just as long as they can be referenced
>> correctly.
>
> But it is in fact only stored in one particular way, as a series of bytes.
>
>> There's no logical Python way to turn that into a series of bytes.
>
> Nonsense.  Play all the semantic games you want, it already is a series
> of bytes.

Internally, they're a series of bytes, but they are MEANINGLESS 
bytes unless you know how they are encoded internally.  Those 
bytes could be UTF-8, UTF-16, UTF-32, or any of a number of other 
possible encodings[1].  If you get the internal byte stream, 
there's no way to meaningfully operate on it unless you also know 
how it's encoded (or you're willing to sacrifice the ability to 
reliably get the string back).
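
To make that concrete, here is a minimal sketch (Python 3 assumed; the sample string and variable names are just for illustration, and it uses the public codecs rather than poking at whatever CPython stores internally). The same text produces different bytes under different encodings, and the bytes only turn back into the original string if you decode with the encoding that produced them:

```python
# One string, three different byte representations.
text = "caf\u00e9"  # 'cafe' with an acute accent on the e

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")
utf32 = text.encode("utf-32-le")

print(utf8)        # b'caf\xc3\xa9'
print(utf16)       # b'c\x00a\x00f\x00\xe9\x00'
print(len(utf32))  # 16 (four bytes per code point)

# Decoding with the matching codec gets the string back.
roundtrip = utf8.decode("utf-8")

# Decoding with the wrong codec "works" but yields different text.
wrong = utf8.decode("latin-1")

# Or it fails outright: these UTF-16 bytes are not valid UTF-8.
try:
    utf16.decode("utf-8")
    utf16_as_utf8_ok = True
except UnicodeDecodeError:
    utf16_as_utf8_ok = False

print(roundtrip == text, wrong == text, utf16_as_utf8_ok)
```

Without knowing which codec produced a given byte string, there is nothing in the bytes themselves that tells you which of these interpretations is the right one.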

-tkc

[1]
http://docs.python.org/library/codecs.html#standard-encodings






