Unicode/UTF-8 confusion

Carsten Haese carsten at uniqsys.com
Sun Mar 16 02:54:03 EDT 2008


On Sat, 2008-03-15 at 16:33 -0400, Tom Stambaugh wrote:
> I appreciate the answers the community has provided. I think I need to add 
> some additional context.
> [...]
>     var aSerializedObject = '%(jsonString)s';
> [...]
> Once back in the browser, the loadObject method calls JSON.parse on 
> aSerializedObject, the json string we're discussing.
> [...]
> In order to successfully pass the escapes to the server, I already have to 
> double each backslash. At the end of the day, it's easier -- and results 
> in better performance -- to convert each apostrophe to its unicode 
> equivalent, as I originally asked.
> [...]

It helps to ask for what you really need instead of asking for what you
think you need. The above helps in that it outlines the source of your
confusion.

What you don't realize is that you're really doing two JSON-encode steps
on the server side, and two JSON-decode steps on the client side. You
have two decode steps because when you place a JSON string on the
right-hand side of a JavaScript assignment, the JavaScript interpreter
parses that string in the same way a JSON parser would. That's an
implicit JSON-decode, and later you're explicitly decoding the result of
that implicit decode with JSON.parse.

You also have two JSON-encode steps. One is an explicit encode step
using simplejson.dumps, and the other is an implicit encode done by a
semi-functional mishmash of double-backslashes and wrapping the string
in apostrophes. As you have discovered, that doesn't work so well when
the string already contains apostrophes.
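
To see the failure concretely, here is a minimal sketch. It assumes the
standard-library json module as a stand-in for simplejson (the dumps
call is used the same way); the helper name render_quoted and the sample
payload are made up for illustration. Only the template line comes from
your message.

```python
import json

# Template line from the original message: the JSON text is wrapped in
# apostrophes, which acts as a hand-rolled second "encode" step.
QUOTED_TEMPLATE = "var aSerializedObject = '%(jsonString)s';"

def render_quoted(obj):
    """Hypothetical helper: one real JSON-encode, then apostrophe wrapping."""
    json_string = json.dumps(obj)
    return QUOTED_TEMPLATE % {"jsonString": json_string}

# Hypothetical payload containing an apostrophe.
sample = {"name": "O'Brien"}
quoted_line = render_quoted(sample)
print(quoted_line)
# The apostrophe inside the data is not escaped, so the JavaScript
# string literal would end right after the letter O.
```

json.dumps has no reason to escape the apostrophe, because an apostrophe
is an ordinary character inside a JSON string. The problem only appears
once the result is pasted between apostrophes in JavaScript source.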

What I suggest you try is this:
1) Get rid of the apostrophes in '%(jsonString)s'.
2) Get rid of all the manual escaping.
3) Send the result of simplejson.dumps through a second simplejson.dumps
step.
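
A minimal sketch of these three steps, again assuming the
standard-library json module in place of simplejson. The helper name
render_double_encoded and the sample payload are hypothetical; the
variable name and the %(jsonString)s placeholder are from your template.

```python
import json

# Same template as before, but without the apostrophes.
UNQUOTED_TEMPLATE = "var aSerializedObject = %(jsonString)s;"

def render_double_encoded(obj):
    """Hypothetical helper: JSON-encode twice, with no manual escaping."""
    inner = json.dumps(obj)          # first encode: object -> JSON text
    js_literal = json.dumps(inner)   # second encode: JSON text -> quoted string
    return UNQUOTED_TEMPLATE % {"jsonString": js_literal}

# Hypothetical payload containing an apostrophe.
sample = {"name": "O'Brien"}
double_line = render_double_encoded(sample)
print(double_line)

# Simulate the two decode steps the browser performs: the implicit one
# (evaluating the string literal) and the explicit JSON.parse.
rhs = double_line[len("var aSerializedObject = "):-1]
round_tripped = json.loads(json.loads(rhs))
```

The second dumps call produces a double-quoted string literal with the
inner quotes and backslashes escaped for you, so the apostrophe no
longer needs any special treatment.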

Alternatively, you could try this:
1) Get rid of the apostrophes in '%(jsonString)s'.
2) Get rid of all the manual escaping.
3) Get rid of the JSON.parse step on the browser side.
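
The server side of this alternative might look like the sketch below,
under the same assumptions as before: the standard-library json module
stands in for simplejson, and the helper name render_single_encoded and
the sample payload are hypothetical.

```python
import json

# Template without apostrophes: the JSON text is emitted as a bare
# JavaScript expression.
UNQUOTED_TEMPLATE = "var aSerializedObject = %(jsonString)s;"

def render_single_encoded(obj):
    """Hypothetical helper: exactly one JSON-encode, no wrapping, no escaping."""
    return UNQUOTED_TEMPLATE % {"jsonString": json.dumps(obj)}

# Hypothetical payload containing an apostrophe.
sample = {"name": "O'Brien"}
single_line = render_single_encoded(sample)
print(single_line)

# The right-hand side is plain JSON text, so a single decode recovers it.
rhs = single_line[len("var aSerializedObject = "):-1]
decoded_once = json.loads(rhs)
```

Here the browser's evaluation of the assignment is the only decode, so
aSerializedObject already holds the object and should not be passed to
JSON.parse.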

With the first alternative, you correctly JSON-encode twice and
correctly JSON-decode twice. With the second alternative, you encode
and decode only once.

Hope this helps,

-- 
Carsten Haese
http://informixdb.sourceforge.net
