[Tutor] How can I see properly my korean.

Kent Johnson kent37 at tds.net
Thu May 31 12:38:16 CEST 2007


Young-gyu Park wrote:

>             fileHandle = open (
>     '/var/chroot/www/htdocs/django/js/model.js', 'w' )
>             fileHandle.write( codecs.BOM_UTF8 )
>             print >> fileHandle, 'var blog = '
>             print >> fileHandle, blog
>             fileHandle.close()
> 
> 
> this is the file model.js
>  
> 
>     var blog =
>     {'description': '\xec\xb9\xb4\xed\x86\xa8\xeb\xa6\xad
....
>     <http://www.hideout.com.br>', 'title': '\xed\x9b\x84\xec\x9b\x90'}]} 
> 
>  
> What I want to do is to see properly the letter not this letter '\xec\x9d'
>  
> Can anyone who know solution let me know how to do kindly?

You haven't shown us enough code. Where does the variable blog come from?

This is a hard question to answer because there are so many ways to get 
confused. How did you display the file? It is possible that it contains 
the correct characters but the method you are using to display them 
shows them as \x escapes. For example, the Python interpreter will do this.
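
One way to tell the two cases apart is to read the raw bytes of the file and
decode them yourself. This is only a minimal sketch, assuming Python 3; the
helper name describe_file is hypothetical, and a temporary file stands in for
your model.js:

```python
import codecs
import os
import tempfile


def describe_file(path):
    """Return (has_bom, text) for a file assumed to be UTF-8 encoded."""
    with open(path, 'rb') as f:
        raw = f.read()
    has_bom = raw.startswith(codecs.BOM_UTF8)
    # The 'utf-8-sig' codec strips a leading BOM if one is present.
    text = raw.decode('utf-8-sig')
    return has_bom, text


# Stand-in for model.js: a BOM followed by the UTF-8 bytes of the 'title' value.
sample = b'\xed\x9b\x84\xec\x9b\x90'
fd, sample_path = tempfile.mkstemp(suffix='.js')
with os.fdopen(fd, 'wb') as f:
    f.write(codecs.BOM_UTF8 + sample)

has_bom, text = describe_file(sample_path)
os.remove(sample_path)

# Six bytes on disk decode to two Hangul characters.
print(has_bom, len(sample), len(text))
```

If the decode succeeds and the text has the length you expect, the file
itself is fine and only the way it was displayed is misleading.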

It looks like you are using a JSON encoder to create the data. Which 
one? Here is an example using the version of SimpleJSON that is bundled 
with Django. It does what you want, but it takes some care to verify that:

In [3]: from django.utils.simplejson import dumps

This is Python so I can use \x escapes to define the string; the actual 
string is UTF-8:

In [4]: data = {'description': '\xec\xb9\xb4\xed\x86\xa8\xeb\xa6\xad \xed\x91\xb8\xeb\xa6\x84\xed\x84\xb0'}

If I ask the interpreter for the value directly, it shows it with 
escapes. (Technically, the interpreter prints repr(value) for any value 
it is asked to display; for strings, repr() inserts \x escapes so the 
result is printable ASCII text.)

In [7]: data['description']
Out[7]: '\xec\xb9\xb4\xed\x86\xa8\xeb\xa6\xad \xed\x91\xb8\xeb\xa6\x84\xed\x84\xb0'

On the other hand, if I ask the interpreter explicitly to print the 
value, the \x escapes are not inserted and the correct characters are shown:

In [8]: print data['description']
카톨릭 푸름터
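
The session above is Python 2. As a rough equivalent, assuming Python 3, the
same distinction shows up between a bytes object and the text you get by
decoding it. The variable names here are only illustrative:

```python
# The same UTF-8 bytes as in the session above, written as a bytes literal.
raw = b'\xec\xb9\xb4\xed\x86\xa8\xeb\xa6\xad \xed\x91\xb8\xeb\xa6\x84\xed\x84\xb0'

# repr() of a bytes object keeps every non-ASCII byte as a \x escape.
escaped = repr(raw)

# Decoding as UTF-8 turns the 19 bytes into 7 characters (6 Hangul, 1 space).
text = raw.decode('utf-8')

print(escaped)
print(len(raw), len(text))
```

Calling print(text) on a terminal that handles UTF-8 should then show the
Hangul characters rather than the escapes.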

The parameter ensure_ascii=False prevents the JSON serializer from 
converting non-ASCII characters to \u escapes, so the UTF-8 bytes pass 
through unchanged.

Here again, showing the converted data directly uses repr() and shows \x 
escapes:

In [6]: dumps(data, ensure_ascii=False)
Out[6]: '{"description": "\xec\xb9\xb4\xed\x86\xa8\xeb\xa6\xad \xed\x91\xb8\xeb\xa6\x84\xed\x84\xb0"}'

If I print the result, I can see that it contains the correct characters:

In [17]: print dumps(data, ensure_ascii=False)
{"description": "카톨릭 푸름터"}
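
For comparison, here is a sketch of the same idea with the json module from
the Python 3 standard library instead of Django's bundled SimpleJSON. That
substitution is my assumption, not what the session above uses. It also
writes the result to a temporary file as UTF-8, in the style of your
model.js code:

```python
import json
import os
import tempfile

# The description from the session above as a text string, written with
# \u escapes so this source stays plain ASCII.
data = {'description': '\uce74\ud1a8\ub9ad \ud478\ub984\ud130'}

escaped = json.dumps(data)                       # ensure_ascii defaults to True
readable = json.dumps(data, ensure_ascii=False)  # keeps the Hangul characters

# Write the readable form to a temporary stand-in for model.js, encoded as UTF-8.
fd, out_path = tempfile.mkstemp(suffix='.js')
os.close(fd)
with open(out_path, 'w', encoding='utf-8') as f:
    f.write('var blog = ' + readable)

# Read the bytes back to see exactly what landed on disk.
with open(out_path, 'rb') as f:
    written = f.read()
os.remove(out_path)

# The default output is pure ASCII, so it is safe to print anywhere.
print(escaped)
```

Both forms are valid JSON and parse back to the same data; the only
difference is whether a person reading the file sees Hangul or \u escapes.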

Kent

