[Python-porting] json encoding

R. David Murray rdmurray at bitdance.com
Sat Jun 20 01:18:00 CEST 2015


On Fri, 19 Jun 2015 14:41:59 -0700, Clay Gerrard <clay.gerrard at gmail.com> wrote:
> Has anyone ported any projects that use json as a data interchange format -
> or can think of any that probably had to port some json handling code?
> 
> I'm perplexed that python3's json.dumps returns a str.  I mean I know it's
> dump*s* - but that was from back when we only had binary strings ;)
> 
> My reading leads me to believe that the json format is not a "string" - it's
> a binary format - encoded in utf-8 [1].

Well, utf-8 is the wire encoding, but the *intent* (as I understand
it) is that json is a unicode string.

> Why would I want to first create a unicode string - then encode it to utf-8?
> What else are you doing with this data exchange format if not using it
> to exchange the representation of the data encoded as bytes to somewhere
> else?
> 
>     json_data = json.dumps(my_object)
>     if not isinstance(json_data, six.binary_type):
>         json_data = json_data.encode('utf-8')
>     my_buffer = BytesIO(json_data)
> 
> ^ Is that really the best I can do?
> 
> Why did json.dumps in python3 [3] lose the encoding kwarg from python2 [2]?

Because json is a string.  You encode it to utf-8 using the string
encode method when you put it on the wire.  It's not json if it's not
unicode or utf-8 (or -16 or -32), so it wouldn't make sense to have an
encoding keyword that can take an arbitrary codec.
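
A minimal sketch of that encode step, assuming Python 3 only.  The
payload and variable names are made up for illustration; the point is
just that json.dumps gives you text, and str.encode gives you the bytes
that go on the wire:

```python
import io
import json

# Illustrative payload containing a non-ASCII character.
payload = {"name": "caf\u00e9", "count": 3}

# On Python 3, json.dumps always returns a str (text), never bytes.
# ensure_ascii=False keeps the e-acute as a real character instead of
# a \u00e9 escape, so the choice of wire encoding actually matters.
text = json.dumps(payload, ensure_ascii=False)

# Encode to UTF-8 only at the point where bytes are needed.
wire = text.encode("utf-8")
my_buffer = io.BytesIO(wire)

# The receiving side reverses the two steps: decode, then parse.
decoded = json.loads(my_buffer.getvalue().decode("utf-8"))
```

With the default ensure_ascii=True the text is pure ASCII, so the
encoded bytes are the same in ASCII and UTF-8.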

> Thanks for any guidance or suggestion - maybe I'm just thinking about it
> wrong.

I haven't written any code that used json and needed to run on both python2
and python3, so I can't tell you if there is an easier way.  But in general
you do have to jump through some hoops when supporting both dialects.
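
One way to keep the hoop in a single place is a small wrapper, sketched
below without six.  This is an untested-in-production sketch, not a
recommendation from experience: dumps_bytes is a hypothetical name, not
part of the json module, and it assumes the default ensure_ascii=True:

```python
import io
import json


def dumps_bytes(obj):
    """Hypothetical helper: return obj serialized as UTF-8 encoded JSON bytes."""
    data = json.dumps(obj)
    if isinstance(data, bytes):
        # Python 2: with the default ensure_ascii=True, json.dumps
        # returns a byte str containing only ASCII, which is already
        # valid UTF-8, so it can be returned unchanged.
        return data
    # Python 3: json.dumps returns text, so encode it for the wire.
    return data.encode("utf-8")


my_buffer = io.BytesIO(dumps_bytes({"a": 1}))
```

The isinstance check works on both versions because bytes is an alias
for str on python2 (2.6 and later) and a distinct type on python3.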

--David

