[OT] is JSON all that great? - was Re: API Help

Chris Angelico rosuav at gmail.com
Thu Jun 15 08:27:40 EDT 2017


On Thu, Jun 15, 2017 at 9:47 PM, Rhodri James <rhodri at kynesim.co.uk> wrote:
>> 1) It is not secure. Check this out:
>> https://stackoverflow.com/questions/1906927/xml-vulnerabilities#1907500
> XML and JSON share the vulnerabilities that come from having to parse
> untrusted external input.  XML then has some extra since it has extra
> flexibility, like being able to specify external resources (potential attack
> vectors) or entity substitution.  If you don't need the extra flexibility,
> feel free to use JSON, but don't for one moment think that makes you
> inherently safe.

Not sure what you mean about parsing untrusted external input. Suppose
you build a web server that receives POST data formatted either JSON
or XML. You take a puddle of bytes, and then proceed to decode them.
Let's say you also decree that it can't be more than 1MB of content
(if it's more than that, you reject it without parsing). Okay. What
vulnerabilities are there in JSON? You could have half a million open
brackets followed by half a million close brackets, but that quickly
terminates Python's parser with a RecursionError. You can segfault
Python if you raise sys.setrecursionlimit() too high, but that's your fault,
not JSON's. Within the 1MB limit, this is the most memory I can
imagine using:

>>> data = b"[" + b"{}," * 349524 + b"{}]"

That expands to 349525 empty objects, represented in Python with
dictionaries, at 288 bytes apiece (using the Python 3.5 size, before
the new compact representation cut that to 240 bytes). Add in the
surrounding list, all 3012904 bytes of it, and the original 1MB input
has expanded to 103,676,104 bytes. That's a hundred-to-one expansion -
significant, but hardly the worst thing an attacker can do. In the SO
link above, a demo is given where a 200KB XML payload expands to >2GB,
for a more than 10K-to-one expansion. "Inherently safe"? At the very
least, far FAR safer. Then there are two XML attacks involving
external resource access. JSON fundamentally cannot do that, ergo
against those it is inherently safe. And the final attack involves
recursion. JSON has no references of any kind, so it fundamentally
cannot represent any form of recursion either.
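
If you want to check both claims on your own interpreter, here is a rough sketch. It assumes CPython with the stdlib json module (3.6 or later, so that json.loads() accepts bytes); the variable names are mine, and the byte counts you get will depend on your Python version and platform, so don't expect the 3.5 figures quoted above.

```python
import json
import sys

# Claim 1: half a million open brackets followed by half a million
# close brackets. The parser should give up with RecursionError
# rather than build a million-deep structure.
nested = "[" * 500000 + "]" * 500000
try:
    json.loads(nested)
    nesting_rejected = False
except RecursionError:
    nesting_rejected = True

# Claim 2: exactly 1MB of input made of as many empty objects as fit.
data = b"[" + b"{}," * 349524 + b"{}]"
objs = json.loads(data)

# Shallow accounting: the outer list plus each empty dict it holds.
# sys.getsizeof() does not follow references, hence the explicit sum.
total = sys.getsizeof(objs) + sum(sys.getsizeof(o) for o in objs)

print("input bytes:   ", len(data))
print("objects parsed:", len(objs))
print("bytes in RAM:  ", total)
print("expansion:      %.1f to one" % (total / len(data)))
print("deep nesting rejected:", nesting_rejected)
```

The accounting is deliberately the same one used in the paragraph above (list size plus per-dict size), so the numbers are comparable across versions even though they won't be identical.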

Winner: JSON, with 3.5 points to XML's 0.5, and that's being generous
enough to give each of them half a point for the payload expansion
attack.

Got any other attacks against JSON? Bear in mind, you have to attack
the format itself, not a buggy parser implementation (which can be
corrected in a bugfix release without hassles).

ChrisA


