Python 3.5, bytes, and %-interpolation (aka PEP 461)

Marko Rauhamaa marko at pacujo.net
Mon Feb 24 17:18:53 EST 2014


random832 at fastmail.us:

> On Mon, Feb 24, 2014, at 15:46, Marko Rauhamaa wrote:
>> That is:
>> 
>>  1. inefficient (encode/decode shuffle)
>> 
>>  2. unnatural (strings usually have no place in protocols)
>
> That's not at all clear. Why _aren't_ these protocols considered text
> protocols? Why can't you add a string directly to headers?

Text expresses a written human language. In prosaic terms, a Python
string is a sequence of ISO 10646 (Unicode) characters, and their code
points are not octets.
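
A minimal sketch of that distinction, using a made-up two-character
sample string (not taken from the thread). It only shows that the
length and elements of a str differ from those of its encoded bytes:

```python
# A str is a sequence of code points; bytes is a sequence of octets.
# The sample string is hypothetical: U+00E4 (a-umlaut) and U+20AC (euro sign).
text = "\u00e4\u20ac"
octets = text.encode("utf-8")

print(len(text))     # number of code points in the str
print(len(octets))   # number of octets after UTF-8 encoding
print(ord(text[1]))  # a code point can exceed 255, so it cannot be an octet
print(list(octets))  # every element of bytes is an integer in range(256)
```

The same characters would yield a different octet sequence under
another encoding, which is why a str by itself has no wire format.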

Most network protocols are defined in terms of octets, although many of
them can carry textual, audio or video payloads (among others). So when
RFC 3507 (ICAP) shows an example starting:

   RESPMOD icap://icap.example.org/satisf ICAP/1.0
   Host: icap.example.org
   Encapsulated: req-hdr=0, res-hdr=137, res-body=296

it consists of octets, not of text in some human language.

In practical terms, you get the bytes off the socket as, well, bytes. It
makes little sense to "decode" those bytes into a string for
manipulation. Manipulating bytes directly is both more efficient and
more natural from the point of view of the standard.
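
As a rough sketch of what manipulating bytes directly can look like,
the following parses the RFC 3507 excerpt above without ever decoding
it. The helper name parse_head and the trailing blank line on the
sample message are assumptions made for illustration, and this is not
a complete ICAP parser. The last step uses bytes %-interpolation, which
needs Python 3.5 or later (PEP 461):

```python
# Sample octets as they might arrive from a socket. The content is the
# RFC 3507 excerpt; the CRLF line endings and final blank line are assumed.
raw = (
    b"RESPMOD icap://icap.example.org/satisf ICAP/1.0\r\n"
    b"Host: icap.example.org\r\n"
    b"Encapsulated: req-hdr=0, res-hdr=137, res-body=296\r\n"
    b"\r\n"
)

def parse_head(data):
    """Hypothetical helper: split a message head into bytes-only parts."""
    head, _, _rest = data.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    method, uri, version = lines[0].split(b" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip()
    return method, uri, version, headers

method, uri, version, headers = parse_head(raw)

# The offsets in "Encapsulated" are ASCII digits; int() accepts bytes.
offsets = {}
for item in headers[b"encapsulated"].split(b","):
    key, _, value = item.partition(b"=")
    offsets[key.strip()] = int(value)

# Building a header line from bytes (Python 3.5+, PEP 461).
host_line = b"Host: %b\r\n" % headers[b"host"]

print(method, version, offsets, host_line)
```

Nothing here passes through str, so there is no encode/decode shuffle
and no encoding to guess.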

Many internet protocols happen to look like text, which makes them
nicer for human network programmers to work with. However, they are
primarily meant for computers, and the message formats are really a
form of binary code.


Marko


