[Python-Dev] Dropping bytes "support" in json

Dirkjan Ochtman dirkjan at ochtman.nl
Thu Apr 9 09:59:56 CEST 2009


On Thu, Apr 9, 2009 at 07:15, Antoine Pitrou <solipsis at pitrou.net> wrote:
> The RFC also specifies a discrimination algorithm for non-supersets of ASCII
> (“Since the first two characters of a JSON text will always be ASCII
>   characters [RFC0020], it is possible to determine whether an octet
>   stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
>   at the pattern of nulls in the first four octets.”), but it is not
> implemented in the json module:

Well, your example is bad in the context of the RFC. The RFC states
that JSON-text = object / array, meaning "loads" for '"hi"' isn't
strictly valid. The discrimination algorithm obviously only works in
the context of that grammar, where the first character of a document
must be { or [ and the next character can only be {, [, }, ], f, n, t,
", -, a digit, or insignificant whitespace (space, \t, \r, \n).

>>>> json.loads('"hi"')
> 'hi'
>>>> json.loads(u'"hi"'.encode('utf16'))
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
>  File "/home/antoine/cpython/__svn__/Lib/json/__init__.py", line 310, in loads
>    return _default_decoder.decode(s)
>  File "/home/antoine/cpython/__svn__/Lib/json/decoder.py", line 344, in decode
>    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
>  File "/home/antoine/cpython/__svn__/Lib/json/decoder.py", line 362, in raw_decode
>    raise ValueError("No JSON object could be decoded")
> ValueError: No JSON object could be decoded
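
For reference, the null-pattern check quoted from the RFC could be sketched
roughly as follows. This is only an illustration, not what the json module
does: the names detect_json_encoding and loads_bytes are made up here, the
code assumes the object/array grammar (so the first two characters are
ASCII), and it ignores byte order marks entirely.

```python
import json


def detect_json_encoding(data: bytes) -> str:
    """Guess the encoding of a JSON text from the nulls in its first four octets.

    Hypothetical helper: assumes the top-level value is an object or array
    and that the input carries no byte order mark.
    """
    head = data[:4]
    if len(head) < 4:
        # "{}" or "[]" takes at least 4 bytes in UTF-16 and 8 in UTF-32,
        # so anything shorter can only be UTF-8 (or is not valid JSON).
        return "utf-8"
    # True where the octet is a null byte.
    nulls = tuple(octet == 0 for octet in head)
    if nulls == (True, True, True, False):    # 00 00 00 xx
        return "utf-32-be"
    if nulls == (True, False, True, False):   # 00 xx 00 xx
        return "utf-16-be"
    if nulls == (False, True, True, True):    # xx 00 00 00
        return "utf-32-le"
    if nulls == (False, True, False, True):   # xx 00 xx 00
        return "utf-16-le"
    return "utf-8"                            # xx xx xx xx


def loads_bytes(data: bytes):
    """Decode bytes using the guessed encoding, then parse the text as JSON."""
    return json.loads(data.decode(detect_json_encoding(data)))
```

With this, loads_bytes('["hi"]'.encode('utf-16-le')) decodes as UTF-16 LE
before parsing. Note that the plain 'utf16' codec used in the example above
prepends a BOM, which this sketch does not handle.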

Cheers,

Dirkjan

