encode and decode builtins

Ned Batchelder ned at nedbatchelder.com
Sun Nov 16 08:43:49 EST 2014


On 11/16/14 2:39 AM, Garrett Berg wrote:
> I made the switch to python 3 about two months ago, and I have to say I
> love everything about it, /especially/ the change to using only bytes
> and str (no more unicode! or... everything is unicode!) As someone who
> works with embedded devices, it is great to know what data I am working
> with.

I am glad that you are excited about Python 3.  But I'm a little 
surprised to hear your characterization of the changes it brought.  Both 
Python 2 and Python 3 are the same in that they have two types for 
representing strings: one for byte strings, and one for Unicode strings.

The difference is that Python 2 called them str and unicode, with "" 
being a byte string; Python 3 calls them bytes and str, with "" being a 
Unicode string.  Also, Python 2 happily converted between them 
implicitly, while Python 3 does not.
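
To make that last point concrete, here is a small Python 3 sketch (the 
variable names are only for illustration).  The two types stay distinct, 
and mixing them raises an error rather than converting silently:

```python
# Python 3: "" is a Unicode string (str), b"" is a byte string (bytes).
text = "caf\u00e9"
raw = text.encode("utf-8")      # explicit conversion, str -> bytes

print(type(text).__name__)      # str
print(type(raw).__name__)       # bytes

# No implicit conversion: mixing the two types is an error.
try:
    text + raw
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Converting back must be explicit as well.
round_trip = raw.decode("utf-8")
```

In Python 2 the equivalent concatenation would have attempted an 
implicit ASCII conversion instead of refusing outright.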

>
> However, there are times that I do not care what data I am working with,
> and I find myself writing something like:
>
>     if isinstance(data, bytes): data = data.decode()
>

This goes against a fundamental tenet of both Python 2 and 3: you should 
know what data you have, and deal with it properly.
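
One common way to follow that tenet is to decode bytes once, at the 
point where they enter the program, and to work with str everywhere 
after that.  A rough sketch, assuming a hypothetical read_packet() that 
stands in for whatever actually delivers bytes from a device, and 
assuming the payload is UTF-8:

```python
def read_packet():
    """Hypothetical stand-in for a device read; always returns bytes."""
    return b"temp=21.5\n"


def parse_reading(line):
    """Works only with str; the caller has already decoded the input."""
    name, _, value = line.strip().partition("=")
    return name, float(value)


raw = read_packet()              # bytes at the boundary
line = raw.decode("utf-8")       # decode once, with an explicit encoding
reading = parse_reading(line)    # everything downstream sees str
```

With this layout the type of each value is known at every step, so no 
isinstance check is needed.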

> This is tedious and breaks the pythonic method of not caring about what
> your input is. If I expect that my input can always be decoded into
> valid data, then why do I have to write this?
>
> Instead, I would like to propose to add *encode* and *decode* as
> builtins. I have written simple code to demonstrate my desire:
>
> https://gist.github.com/cloudformdesign/d8065a32cdd76d1b3230

If you find these functions useful, by all means use them in your code.  
BTW: it looks to me like you have infinite recursion on lines 9 and 20, 
which must be a simple oversight.
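
For reference, helpers along those lines can be quite short.  The 
following is only a sketch of the idea, not the code from the gist; the 
UTF-8 default and the TypeError for other input types are assumptions.  
Each function calls the method on the object rather than calling 
itself, so there is no recursion:

```python
def decode(data, encoding="utf-8"):
    """Return data as str, decoding it first if it is bytes."""
    if isinstance(data, (bytes, bytearray)):
        return data.decode(encoding)   # the bytes method, not this function
    if isinstance(data, str):
        return data
    raise TypeError("expected bytes or str, got %s" % type(data).__name__)


def encode(data, encoding="utf-8"):
    """Return data as bytes, encoding it first if it is str."""
    if isinstance(data, str):
        return data.encode(encoding)   # the str method, not this function
    if isinstance(data, (bytes, bytearray)):
        return bytes(data)
    raise TypeError("expected str or bytes, got %s" % type(data).__name__)
```

Note that such helpers still have to pick an encoding, which is one 
reason the caller usually needs to know what the data is anyway.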

>
> There may be a few edge cases I am missing, which would all the more
> prove my point -- we need a function like this!

You are free to write functions like that.  Getting them added to the 
standard library is extremely unlikely.

>
> Basically, if I expect my data to be a string I can just write:
>
>     data = decode(data)
>
> Which would accomplish two goals: explicitly stating what I expect of
> my data, and doing so concisely and cleanly.
>
>
>


-- 
Ned Batchelder, http://nedbatchelder.com




More information about the Python-list mailing list