"convert" string to bytes without changing data (encoding)

Heiko Wundram modelnine at modelnine.org
Wed Mar 28 06:42:43 EDT 2012


On 28.03.2012 11:43, Peter Daum wrote:
> ... in my example, the variable s points to a "string", i.e. a series of
> bytes, (0x61,0x62 ...) interpreted as ascii/unicode characters.

No; a string contains a series of codepoints from the Unicode code space, 
representing natural language characters (at least in the simplistic 
view; I'm not talking about surrogates). These can be encoded to 
different binary storage representations, of which ascii is (a common) 
one.
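
To illustrate the point above, here is a minimal sketch, assuming Python 3 
(where str holds codepoints and bytes holds raw data). The sample string is 
made up for illustration; it shows that the same codepoints yield different 
byte sequences depending on the encoding chosen.

```python
# A str is a sequence of codepoints; bytes only exist once an encoding is chosen.
s = "ab\u00e9"  # 'a', 'b', and U+00E9 (e with acute accent)

# The codepoints themselves are plain integers, independent of any encoding.
code_points = [ord(ch) for ch in s]

# The same three codepoints, three different binary representations.
as_latin1 = s.encode("latin-1")     # one byte per codepoint
as_utf8 = s.encode("utf-8")         # U+00E9 becomes two bytes
as_utf16 = s.encode("utf-16-le")    # two bytes per codepoint here

# ascii covers only codepoints below 0x80, so U+00E9 cannot be encoded.
try:
    s.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(code_points, as_latin1, as_utf8, as_utf16, ascii_ok)
```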

> What I am looking for is a general way to just copy the raw data
> from a "string" object to a "byte" object without any attempt to
> "decode" or "encode" anything ...

There is "logically" no raw data in the string, just a series of 
codepoints, as stated above. You'll have to specify the encoding to use 
to get at "raw" data, and from what I gather you're interested in the 
latin-1 (iso-8859-1) encoding, which maps codepoints 0x00 through 0xFF 
onto the identical byte values, as you're specifically referencing 
chars >= 0x80 (which hints at your mindset being in LATIN-land, so to 
speak).
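
A small sketch of what that looks like in practice, assuming Python 3 and 
assuming that every codepoint in the string is below 0x100 (the sample 
values are made up for illustration). Under latin-1 each codepoint becomes 
the byte with the same numeric value, which is probably the closest thing 
to "copying the raw data":

```python
# Hypothetical sample: codepoints 0x61, 0x62, 0x80, 0xFF.
s = "ab\x80\xff"

# latin-1 maps codepoint N (0 <= N <= 255) to the single byte N.
raw = s.encode("latin-1")
byte_values = list(raw)                  # iterating bytes yields ints in Python 3
same_values = byte_values == [ord(ch) for ch in s]

# The mapping is reversible for all 256 byte values.
round_trip = raw.decode("latin-1") == s
all_bytes_ok = bytes(range(256)).decode("latin-1").encode("latin-1") == bytes(range(256))

# Codepoints above 0xFF have no latin-1 representation and raise an error.
try:
    "\u20ac".encode("latin-1")           # EURO SIGN, U+20AC
    euro_ok = True
except UnicodeEncodeError:
    euro_ok = False

print(raw, same_values, round_trip, all_bytes_ok, euro_ok)
```

Note that this is still an encoding step; it merely happens to be one 
where the byte values equal the codepoint values.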

-- 
--- Heiko.


