[Python-ideas] RFC: bytestring as a str representation [was: a new bytestring type?]
Steven D'Aprano
steve at pearwood.info
Wed Jan 8 01:39:11 CET 2014
On Tue, Jan 07, 2014 at 08:48:05AM -0800, Ethan Furman wrote:
> [...] My binary stream is mixed:
>
> - binary that has to be converted (4-byte ints, for example)
> - ascii that has to be converted (ints stored as ascii text)
> - encoded text (character and memo fields)
Ethan, you keep referring to ascii text and encoded text as if they are
different things. They're not. You have a binary file containing bytes.
Some of those bytes represent data of one kind (say, 4-byte ints). Some
of those bytes represent data of a different kind (Latin-1 encoded text
representing character and memo fields) and other bytes represent data
of a third kind (ASCII encoded text representing ints, but you don't
mention what the meaning of those ints is).
ASCII or Latin-1, the text is still encoded into bytes, and still needs
to be decoded back to text. Since Latin-1 is a superset of ASCII, you
could use Latin-1 for them all, and still get the same result.
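A small sketch of that claim, using made-up sample values rather than
anything from Ethan's actual files: every ASCII byte decodes to the same
character under either codec, so a purely ASCII field gives the same
result whichever one you pick.

```python
# Every ASCII byte (0x00-0x7F) decodes to the same character under Latin-1.
ascii_bytes = bytes(range(128))
assert ascii_bytes.decode('ascii') == ascii_bytes.decode('latin-1')

# A (hypothetical) int stored as space-padded ASCII text converts to the
# same value whichever of the two codecs is used to decode it.
field = b'  1234'
assert int(field.decode('ascii')) == int(field.decode('latin-1')) == 1234

# Latin-1 additionally maps bytes 0x80-0xFF, which the ASCII codec rejects.
high = b'caf\xe9'
text = high.decode('latin-1')
try:
    high.decode('ascii')
    ascii_rejected = False
except UnicodeDecodeError:
    ascii_rejected = True
```

The difference only shows up for bytes above 0x7F, which is why the
stricter ASCII codec can still be useful as a sanity check on fields that
are supposed to hold nothing but digits.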
Of course you can't just decode the entire file into Latin-1, since
parts of it represent non-text data, but you could decode all the text
parts individually using Latin-1 and/or ASCII.
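Something along these lines, as a rough sketch only. The record layout
below (field order, widths, little-endian ints) is invented for the
example and is not Ethan's real format; parse_record is likewise a
hypothetical helper name.

```python
import struct

# Hypothetical 20-byte record layout, assumed purely for illustration:
#   bytes 0-3    little-endian signed 4-byte int (binary, not text)
#   bytes 4-9    int stored as space-padded ASCII text
#   bytes 10-19  Latin-1 encoded character field, space padded
record = (
    struct.pack('<i', 42)
    + b'  1234'
    + 'Se\xf1or'.encode('latin-1').ljust(10)
)

def parse_record(raw):
    """Slice one raw record into fields and convert each part on its own."""
    (binary_int,) = struct.unpack('<i', raw[0:4])   # non-text: unpack, never decode
    ascii_int = int(raw[4:10].decode('ascii'))      # decode digits, then convert
    name = raw[10:20].decode('latin-1').rstrip()    # decode text, drop padding
    return binary_int, ascii_int, name

print(parse_record(record))
```

Only the two text slices are ever passed to decode(); the binary int goes
through struct instead, which is the reason the file cannot be decoded in
one go.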
(To those reading and wondering how I know the character and memo fields
use Latin-1, Ethan has discussed this case on comp.lang.python.)
--
Steven