[Python-Dev] Bug in json (the format and the module)

Bob Ippolito bob at redivi.com
Tue May 17 20:18:15 CEST 2011


By default the json module already escapes anything outside of 7-bit
ASCII, so this is a non-issue unless you're using ensure_ascii=False.
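
A quick way to check this, as a minimal sketch (the sample string is made
up for illustration):

```python
import json

# Hypothetical sample containing a literal U+2028 and U+2029.
text = "line\u2028sep\u2029end"

# ensure_ascii=True is the default: non-ASCII is written as \uXXXX escapes.
default = json.dumps(text)

# With ensure_ascii=False the separators are emitted as literal characters.
raw = json.dumps(text, ensure_ascii=False)

print(default)
print("\u2028" in default)  # literal separator present in default output?
print("\u2028" in raw)      # literal separator present with ensure_ascii=False?
```

Both outputs decode back to the same string with json.loads; only the
second one can trip up a JavaScript parser.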

I implemented a workaround for ensure_ascii=False in simplejson here.
It would be pretty trivial to add this feature to the json module as
well:
https://github.com/simplejson/simplejson/commit/4989e693bab39b1ce5cf6fc0b21dbacd108c312c
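
For the stdlib json module, one possible approach is to post-process the
encoder output. This is only a sketch and not necessarily what the commit
above does; dumps_js_safe is a hypothetical helper name, not part of json
or simplejson:

```python
import json

def dumps_js_safe(obj, **kwargs):
    """Hypothetical helper: like json.dumps(..., ensure_ascii=False), but
    with U+2028 and U+2029 written as \\u2028 and \\u2029 escapes."""
    kwargs.setdefault("ensure_ascii", False)
    out = json.dumps(obj, **kwargs)
    # In json.dumps output with default separators these characters can
    # only occur inside string literals, where the escaped form decodes
    # to the same character, so a plain replace keeps the semantics.
    return out.replace("\u2028", "\\u2028").replace("\u2029", "\\u2029")

encoded = dumps_js_safe({"k": "a\u2028b\u2029c \u00e9"})
print(encoded)
```

Other non-ASCII characters are left as-is, which is the point of using
ensure_ascii=False in the first place.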

On Tue, May 17, 2011 at 11:40 AM, Jeremy Dunck <jdunck at gmail.com> wrote:
> This blog post describes a bug in a common usage pattern of JSON:
>
> http://timelessrepo.com/json-isnt-a-javascript-subset
>
> That is, there are some characters which are legal in JSON
> serializations, but not in JavaScript strings.
>
> This works OK for JSON parsers, but a common use case of JSON is
> JSONP, where the result of a request is presumed to be executable
> JavaScript:
>
> <script src="http://someapi.com/jsonp?callback=foo"> might return a response:
>
> foo({"some_json":"which might or might not be legal javascript"})
>
> The post also suggests a solution -- to replace literal U+2028 - Line
> separator and U+2029 - Paragraph separator with their escape sequences
> \u2028 and \u2029.
>
> This is a nice solution in that it makes the JSON valid JS while
> keeping the same semantics.  Of course there's the annoyance of
> processing the full string, comparable in overhead to utf-8 encoding,
> I presume.
>
> So, to start with, is there a maintainer for the json module, or how
> should I go about discussing implementing this solution?