Packing a simple dictionary into a string - extending struct?

Jonathan Fine jfine at pytex.org
Fri Jun 22 03:08:53 EDT 2007


Jonathan Fine wrote:

> Thank you for this suggestion.  The growing adoption of JSON in Ajax
> programming is a strong argument for my using it in my application, although
> I think I'd prefer something a little more binary.
> 
> So it looks like I'll be using JSON.

Well, I tried.  But I came across two problems (see below).

First, there's bloat.  For binary byte data, on average one
byte becomes just over 4 characters.

Second, there's the inconvenience.  I can't simply take a
sequence of bytes and encode them using JSON.  I have to
turn them into Unicode first.  And I guess there's a similar
problem at the other end.
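
A minimal sketch of that extra conversion step, assuming Python 3 and
the standard json module rather than the simplejson / Python 2.4 setup
used in this post.  The helper names bytes_to_json and json_to_bytes are
hypothetical, and mapping each byte to the code point with the same
number (latin-1) is just one possible choice:

```python
import json

def bytes_to_json(data):
    # Hypothetical helper: map each byte 0..255 to the code point with
    # the same number (latin-1), then JSON-encode the resulting text.
    return json.dumps(data.decode('latin-1'))

def json_to_bytes(text):
    # Reverse step at the other end; raises UnicodeEncodeError if the
    # decoded string contains a code point above 255.
    return json.loads(text).encode('latin-1')

raw = bytes(range(256))
encoded = bytes_to_json(raw)
restored = json_to_bytes(encoded)
print(len(raw), len(encoded), restored == raw)
```

Both ends have to agree on the byte-to-code-point mapping, which is
the inconvenience described above.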

So I'm going with my own solution:
http://mathtran.cvs.sourceforge.net/mathtran/py/bytedict.py?revision=1.1&view=markup

It seems to be related to cerealizer:
http://home.gna.org/oomadness/en/cerealizer/index.html

It seems to me that JSON works well for Unicode text, but not
with binary data.  Indeed, Unicode hides the binary form of
the stored data, presenting only the code points.  But I don't
have Unicode strings!

Here's my test script, which shows why I'm not using JSON:
===
import simplejson

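# Build a Unicode string holding code points 0..255, one per byte value.
x = u''
for i in range(256):
    x += unichr(i)

# Length of the JSON encoding of those 256 characters.
print len(simplejson.dumps(x)), '\n'

# A raw (non-Unicode) byte string with a high byte cannot be encoded.
simplejson.dumps(chr(128))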
===

Here's the output
===
1046  # 256 bytes => 256 * 4 + 22 bytes

Traceback (most recent call last):
  <snip>
   File "/usr/lib/python2.4/encodings/utf_8.py", line 16, in decode
     return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: unexpected code byte
===
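
To see where the factor of just over 4 comes from, here is a rough
breakdown by escape length.  It is a sketch that assumes Python 3 and
the standard json module with its default ASCII-only output, not the
simplejson version used above, so the exact total may differ slightly
from the figure in the output:

```python
import json
from collections import Counter

# For each code point 0..255, record how many characters its JSON
# encoding takes, not counting the two surrounding quote marks.
lengths = Counter(len(json.dumps(chr(i))) - 2 for i in range(256))

# Total size of one JSON string holding all 256 code points.
total = sum(size * count for size, count in lengths.items()) + 2

for size in sorted(lengths):
    print(size, 'chars each:', lengths[size], 'code points')
print('total:', total, 'average:', total / 256.0)
```

Most of the cost comes from the code points written as six-character
\uXXXX escapes, which includes every value from 128 to 255.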

-- 
Jonathan



