multiple JSON documents in one file, change proposal

Chris Angelico rosuav at gmail.com
Fri Nov 30 18:04:20 EST 2018


On Sat, Dec 1, 2018 at 9:46 AM Marko Rauhamaa <marko at pacujo.net> wrote:
>
> Paul Rubin <no.email at nospam.invalid>:
> > Maybe someone can convince me I'm misusing JSON but I often want to
> > write out a file containing multiple records, and it's convenient to
> > use JSON to represent the record data.
> >
> > The obvious way to read a JSON doc from a file is with "json.load(f)"
> > where f is a file handle. Unfortunately, this throws an exception
>
> I have this "multi-JSON" need quite often. In particular, I exchange
> JSON-encoded messages over byte stream connections. There are many ways
> of doing it. Having rejected different options (<URL:
> https://en.wikipedia.org/wiki/JSON_streaming>), I settled with
> terminating each JSON value with an ASCII NUL character, which is
> illegal in JSON proper.
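
For anyone who hasn't seen that kind of framing, here is a minimal sketch of NUL-terminated JSON messages. The function names encode_message and decode_messages are made up for illustration, and it assumes UTF-8 on the wire; it is not Marko's actual code. It relies on json.dumps escaping any NUL inside a string as \u0000, so a raw NUL byte can only ever be a terminator.

```python
import json

def encode_message(obj):
    # json.dumps() never emits a raw NUL (it escapes it as \u0000),
    # so a trailing NUL byte is an unambiguous terminator.
    return json.dumps(obj).encode("utf-8") + b"\0"

def decode_messages(buffer):
    # Everything before the last NUL is a complete message; whatever
    # follows it is an incomplete tail to keep for the next read.
    *complete, rest = buffer.split(b"\0")
    return [json.loads(chunk) for chunk in complete], rest

stream = encode_message({"foo": 1}) + encode_message([1, 2]) + b'{"par'
objs, rest = decode_messages(stream)
print(objs, rest)
```

The caller would prepend the leftover bytes to whatever arrives next on the connection before calling decode_messages again.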

There actually is a way to parse concatenated JSON (where you start
the next object straight after the previous one, possibly with
whitespace). It's not widely advertised, but it's based on this
method:

https://docs.python.org/3/library/json.html#json.JSONDecoder.raw_decode

Example usage:

data = """\
{"foo": 1, "bar": 2}
{"foo": 3, "bar": 4}
{"foo": 5, "bar": 6}
"""

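from json import JSONDecoder

def parse(s, *, decoder=JSONDecoder()):
    # raw_decode() returns the first JSON value in s plus the offset
    # just past it, ignoring whatever follows.
    obj, offset = decoder.raw_decode(s)
    # raw_decode() rejects leading whitespace, so strip the remainder
    # before it gets passed back in.
    return obj, s[offset:].strip()

# Peel off one document at a time until the string is empty.
while data:
    obj, data = parse(data)
    print(obj)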

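A variation on the same idea, as a rough sketch: raw_decode also takes a start index, so you can walk through the string without re-slicing it after every object. The generator name iter_json is my own invention, not something in the json module.

```python
from json import JSONDecoder

def iter_json(s, *, decoder=JSONDecoder()):
    """Yield each JSON value from a string of concatenated documents."""
    pos = 0
    end = len(s)
    while True:
        # raw_decode() does not skip leading whitespace, so step over
        # the JSON whitespace characters by hand.
        while pos < end and s[pos] in " \t\n\r":
            pos += 1
        if pos >= end:
            return
        # The second return value is the index in s just past the value.
        obj, pos = decoder.raw_decode(s, pos)
        yield obj

docs = list(iter_json('{"foo": 1, "bar": 2}\n{"foo": 3, "bar": 4} [5, 6]\n'))
print(docs)
```

This avoids copying the tail of the string on every iteration, which starts to matter once the input gets large.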

> > I also recommend the following article to those not aware of how badly
> > designed JSON is: http://seriot.ch/parsing_json.php

(There's some irony in using a PHP web site to discuss bad design.)

> JSON is not ideal, but compared with XML, it's a godsend.
>
> What would be ideal? I think S-expressions would come close, but people
> can mess up even them: <URL: https://www.ietf.org/rfc/rfc2693.txt>.

Crockford is opinionated and egotistical. And like all humans, often
wrong. JSON is far from a perfect interchange format, and there have
been numerous attempts to improve on it (for instance, SUPPORTING
TRAILING COMMAS!! ARGH!). One example is JSON5 (https://json5.org/
and https://pypi.org/project/json5/), which is, I believe, a strict
superset of JSON and also a strict subset of JavaScript, keeping it
fully compatible both ways.

JSON isn't perfect, but for many purposes, it is good enough. Where it
isn't, there are often reasonable extensions (such as the use of
raw_decode to read multiple objects from a string) that can help.

ChrisA


