[Python-ideas] RFC: bytestring as a str representation [was: a new bytestring type?]

Ethan Furman ethan at stoneleaf.us
Wed Jan 8 02:19:38 CET 2014


On 01/07/2014 04:39 PM, Steven D'Aprano wrote:
> On Tue, Jan 07, 2014 at 08:48:05AM -0800, Ethan Furman wrote:
>
>> [...] My binary stream is mixed:
>>
>>    - binary that has to be converted (4-byte ints, for example)
>>    - ascii that has to be converted (ints stored as ascii text)
>>    - encoded text (character and memo fields)
>
> Ethan, you keep referring to ascii text and encoded text as if they are
> different things. They're not.

Would you feel better if I called them ASCII-encoded text, and other-encoded text?  And they are different, if for no 
other reason than they are using different encodings.  Further, the ASCII-encoded text can be directly compared with 
byte sequences because . . . they're bytes! ;)
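
To make the mixed layout concrete, here is a minimal sketch. The record layout, field widths, and values are invented for illustration and are not taken from any real file format: a 4-byte little-endian int, a 2-byte ASCII int, and a 6-byte Latin-1 text field.

```python
import struct

# Hypothetical 12-byte record, built here only so the example is self-contained.
record = struct.pack('<i', 1024) + b'35' + 'Zo\xeb   '.encode('latin-1')

# One unpack call splits the three kinds of data apart.
count, raw_age, raw_name = struct.unpack('<i2s6s', record)

# The ASCII field can be compared against a bytes literal as-is.
age_matches = (raw_age == b'35')

# Only the Latin-1 field needs an explicit decode to become text.
name = raw_name.decode('latin-1').rstrip()
```

Only the last field goes through a decode step; the first is handled by struct and the second stays bytes.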

>  You have a binary file containing bytes.
> Some of those bytes represent data of one kind (say, 4-byte ints). Some
> of those bytes represent data of a different kind (Latin-1 encoded text
> representing character and memo fields) and other bytes represent data
> of a third kind (ASCII encoded text representing ints, but you don't
> mention what the meaning of those ints is).

ASCII-encoded text representing ints are ints.  I don't know what they mean, but presumably they have something to do with 
whatever the user named the field.  For example, I would imagine that b'35' in an AGE field meant 35 years; luckily I 
only have to give the user back the integer 35, not figure out what it's supposed to mean.


> ASCII or Latin-1, the text is still encoded into bytes, and still needs
> to be decoded back to text.

No, it doesn't.  I don't need to convert b'35' into u'35' to convert to 35.  I don't need to convert b'N' to u'N' to 
know I have a Numeric field, nor b'T' to u'T' to get True.
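
A short sketch of that point in Python 3, where int() accepts a bytes object directly. The FIELD_TYPES table and the sample values are hypothetical and used only for illustration:

```python
# int() parses ASCII digits straight from bytes; no decode step is needed.
age = int(b'35')

# Hypothetical lookup table keyed on single-byte type codes.
FIELD_TYPES = {b'N': 'Numeric', b'C': 'Character'}

header = b'N'
# Slice rather than index: header[0] would be the int 78, not b'N'.
field_type = FIELD_TYPES[header[0:1]]

# A logical field can be tested against a bytes literal directly.
flag = (b'T' == b'T')
```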

--
~Ethan~

