multiple JSON documents in one file, change proposal

Marko Rauhamaa marko at pacujo.net
Sat Dec 1 05:10:50 EST 2018


Paul Rubin <no.email at nospam.invalid>:

> Marko Rauhamaa <marko at pacujo.net> writes:
>> Having rejected different options (<URL:
>> https://en.wikipedia.org/wiki/JSON_streaming>), I settled with
>> terminating each JSON value with an ASCII NUL character, which is
>> illegal in JSON proper.
>
> Thanks, that Wikipedia article is helpful.  I'd prefer to not use stuff
> like NUL or RS because I like keeping the file human readable.  I might
> use netstring format (http://cr.yp.to/proto/netstrings.txt) but I'm even
> more convinced now that adding a streaming feature to the existing json
> module is the right way to do it.

We all have our preferences.

In my case, I need an explicit terminator marker to know when a JSON
value is complete. For example, if I should read from a socket:

   123

I can't yet parse it because there might be another digit coming. On the
other hand, the peer might not see any reason to send any further bytes
because "123" is all they wanted to send at the moment.
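To illustrate, here is a minimal sketch of a receiver that holds incoming bytes until the NUL terminator arrives. The class name NulFramer and its feed() method are made up for this example, and it assumes the peer sends UTF-8 with exactly one NUL after each value:

```python
import json


class NulFramer:
    """Hypothetical helper: buffer bytes, release complete JSON values."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, data):
        """Add received bytes; return the JSON values completed so far."""
        self._buf += data
        values = []
        while True:
            end = self._buf.find(b"\0")
            if end < 0:
                # No terminator yet: "123" could still grow into "1234".
                break
            frame = bytes(self._buf[:end])
            del self._buf[:end + 1]  # drop the frame and its NUL
            values.append(json.loads(frame.decode("utf-8")))
        return values


framer = NulFramer()
first = framer.feed(b"123")    # nothing yet, the value may be incomplete
second = framer.feed(b"4\0")   # terminator seen, the value is final
```

Here first is an empty list and second holds the single value 1234. The data handed to feed() would normally come from something like sock.recv().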

As for NUL, a control character that is illegal in all JSON contexts is
practical because the JSON chunks then don't need to be escaped. An ASCII-esque
solution would be to pick ETX (= end of text). Unfortunately, a human
operator typing ETX (= ctrl-C) to terminate a JSON value will cause a
KeyboardInterrupt in many modern command-line interfaces.

It so happens that NUL (= ctrl-SPC = ctrl-@) is pretty easy to generate
and manipulate in editors and on the command line.
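The "no escaping needed" point can be checked on the sending side with the standard json module. This is only a sketch; encode_frame is a hypothetical helper name, not part of any library:

```python
import json


def encode_frame(value):
    """Hypothetical helper: serialize one value and append the NUL terminator."""
    text = json.dumps(value)
    # A NUL inside a string is written as the escape \u0000, so the
    # serialized text itself never contains a raw NUL character.
    assert "\0" not in text
    return text.encode("utf-8") + b"\0"


# Even a payload that carries a NUL in a string yields exactly one raw
# NUL byte on the wire: the terminator.
frame = encode_frame({"key": "a\0b"})
```

The receiver can therefore split on NUL without looking inside the chunks at all.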

The need for the format to be "typable" (and editable) is essential for
ad-hoc manual testing of components. That precludes all framing formats
that would necessitate a length prefix. HTTP would be horrible to have
to type even without the content-length problem, and BEEP (RFC 3080)
would suffer from the content-length (and CRLF!) issue as well.

Finally, couldn't any whitespace character work as a terminator? Yes, it
could, but it would force you to use a special JSON parser that is
prepared to handle the self-delineation. A NUL gives you many more
degrees of freedom in choosing your JSON tools.
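A small sketch of the difference, using made-up sample data: pretty-printed JSON contains whitespace of its own, so a whitespace terminator cannot be found by plain splitting, while a NUL terminator can.

```python
import json

docs = [{"a": [1, 2]}, "two words", 3]  # sample values for illustration

# Pretty-printed values contain spaces and newlines of their own.
nul_stream = "".join(json.dumps(d, indent=2) + "\0" for d in docs)

# Splitting on NUL is enough; each chunk goes to any ordinary parser.
chunks = nul_stream.split("\0")[:-1]
decoded = [json.loads(c) for c in chunks]

# With a newline as the terminator, naive splitting cuts values apart,
# so the reader would have to be a parser that tracks nesting itself.
newline_stream = "".join(json.dumps(d, indent=2) + "\n" for d in docs)
naive_chunks = newline_stream.split("\n")[:-1]
```

Here decoded equals docs again, whereas naive_chunks has more pieces than there are values, and some of them are not valid JSON on their own.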


Marko


