[Python-ideas] .from and .to instead of .encode and .decode

Mon Mar 16 14:09:36 CET 2015

On Sun, Mar 8, 2015 at 4:07 PM, Giampaolo Rodola' <g.rodola at gmail.com> wrote:
> On Sat, Mar 7, 2015 at 2:42 PM, Luciano Ramalho <luciano at ramalho.org> wrote:
>>
>> On Sat, Mar 7, 2015 at 8:41 AM, Chris Angelico <rosuav at gmail.com> wrote:
>> > If it says "decode", the result is a Unicode string. If it says
>> > "encode", the result is bytes. I'm not sure what is difficult here.
>>
>> Yep. When I teach, I use this mnemonic, which I can now quote from my
>> book [1] ;-)
>>
>> [TIP]
>> ====
>> If you need a memory aid to distinguish `.decode()` from `.encode()`,
>> convince yourself that a Unicode `str` contains "human" text, while
>> byte sequences can be cryptic machine core dumps. Therefore, it makes
>> sense that we *decode* `bytes` to `str` to get human readable text,
>> and we *encode* text to `bytes` for storage or transmission.
>> ====
>
>
> This is great advice (and yes, I also often get confused by the two).

I use the analogy of "encoding abstract numbers into specific bytes" myself.
It is still hard to review source code, though, because it may use the calls
the wrong way round.

db.get(query).from('utf-8') gives me a clear indication that the author expects
the db contents to be in 'utf-8'.

db.get(query).decode('utf-8') gives me the feeling that the query result should
be decoded from whatever format it is in to 'utf-8'.

So if Python supported .from and .to, the part about encoding wouldn't feel
like math.
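
To make the two readings concrete, here is a rough sketch in today's Python.
Since `from` is a reserved word, .from(...) could not be spelled that way as a
method call, so the names from_encoding and to_encoding below are purely
hypothetical stand-ins for the proposed methods; nothing like them exists in
the standard library. The sample bytes are made up for illustration.

```python
# Hypothetical helpers illustrating the proposed naming; they only wrap
# the existing bytes.decode() and str.encode() methods.

def from_encoding(data: bytes, encoding: str) -> str:
    """Interpret `data` as text that was stored in `encoding`."""
    return data.decode(encoding)


def to_encoding(text: str, encoding: str) -> bytes:
    """Serialize `text` into bytes using `encoding`."""
    return text.encode(encoding)


# Made-up sample: UTF-8 bytes such as a database driver might return.
raw = b"caf\xc3\xa9"

text = from_encoding(raw, "utf-8")      # bytes -> str, "the data is UTF-8"
round_trip = to_encoding(text, "utf-8") # str -> bytes, "write it as UTF-8"
latin = to_encoding(text, "latin-1")    # same text, different byte layout
```

Read this way, the argument always names the encoding of the bytes side,
whichever direction the conversion goes.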