inserting Unicode character in dictionary - Python

gitaziabari at gmail.com
Sat Oct 18 20:45:59 EDT 2008


On Oct 17, 2:38 pm, Joe Strout <j... at strout.net> wrote:
> Thanks for the answers.  That clears things up quite a bit.
>
> >> What if your source file is set to utf-8?  Do you then have a proper
> >> UTF-8 string, but the problem is that none of the standard Python
> >> library methods know how to properly interpret UTF-8?
>
> > Well, the decode method knows how to decode those bytes into a
> > `unicode` object if you call it with 'utf-8' as argument.
>
> OK, good to know.
>
> >> 4. In Python 3.0, this silliness goes away, because all strings are
> >> Unicode by default.
>
> > Yes and no.  The problem just shifts because at some point you get
> > into similar troubles, just in the other direction.  Data enters the
> > program as bytes and must leave it as bytes again, so you have to deal
> > with encodings at those points.
>
> Yes, but that's still much better than having to litter your code with
> 'u' prefixes and .decode calls and so on.  If I'm using a UTF-8-savvy
> text editor (as we all should be doing in the 21st century!), and type
> "foo = '2π'", I should get a string containing a '2' and a pi
> character, and all the text operations (like counting characters,
> etc.) should Just Work.
>
> When I read and write files or sockets or whatever, of course I'll
> have to think about what encoding the text should be... but internal
> to my own source code, I shouldn't have to.
>
> I understand the need for a transition strategy, which is what we have
> in 2.x, and that's working well enough.  But I'll be glad when it's
> over.  :)
>
> Cheers,
> - Joe

Thanks for the answers. The following factors should be considered when
dealing with Unicode characters in Python:
1. Declaring # -*- coding: utf-8 -*- at the top of the script
2. Opening files with the appropriate encoding:
   import codecs
   txt = codecs.open(filename, 'w+', encoding='utf-8')
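
The two points above can be sketched roughly as follows. This is only an
illustration, not the code from my program: it uses io.open rather than
codecs.open (both accept an encoding argument), and the temporary
directory, the file name pi.txt and the variable names are made up for
the example.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# A unicode string holding "2" followed by the pi character (U+03C0).
text = u'2\u03c0'

# Hypothetical output file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'pi.txt')

# Text leaves the program: the encoding is chosen when the file is opened.
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(text)

# Reading the file back in binary mode shows the encoded bytes on disk.
with io.open(path, 'rb') as f:
    raw = f.read()

# Reading it back in text mode with the same encoding restores the string.
with io.open(path, 'r', encoding='utf-8') as f:
    round_trip = f.read()
```

The string has two characters, while the file holds three bytes, because
pi takes two bytes in UTF-8.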

My program works fine now. There is no special way of adding Unicode
characters to a list or a dictionary. The string itself just has to be a
unicode object (for example, the result of calling .decode('utf-8') on
the bytes) before it goes in.
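
As a small sketch of what I mean (the byte string and the names raw,
text, counts and items are invented for the example, not taken from my
program): decode the incoming bytes once, then use the result like any
other key or list element.

```python
# -*- coding: utf-8 -*-

# Bytes as they might arrive from a file or socket: "2" plus pi in UTF-8.
raw = b'2\xcf\x80'

# Decode once at the boundary to get a unicode string.
text = raw.decode('utf-8')

# After that, dictionaries and lists need no special handling.
counts = {}
counts[text] = len(text)   # key is the unicode string, value is its length
items = [text]

# Looking the key up with an equal unicode string finds the same entry.
found = counts[u'2\u03c0']
```

Here len(text) counts characters, so the stored value is 2, whereas
len(raw) would count the three encoded bytes.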

Cheers,

Gita


