How to print first(national) char from unicode string encoded in utf-8?

sniipe at gmail.com
Mon Sep 1 09:25:17 EDT 2008


On 1 Sep, 15:10, "Marco Bizzarri" <marco.bizza... at gmail.com> wrote:
> 2008/9/1  <sni... at gmail.com>:
>
>
>
> > Hi,
>
> > I have a problem with a unicode string in Pylons templates (Mako). I want
> > to print the first character of a string that is encoded in UTF-8 and
> > quoted with urllib.quote(), for example the string 'Łukasz':
>
> > ${urllib.unquote(c.user.firstName).encode('latin-1')[0:1]}
>
> > and I received this information:
>
> > <type 'exceptions.UnicodeDecodeError'>: 'utf8' codec can't decode byte
> > 0xc5 in position 0: unexpected end of data
>
> > When I change [0:1] to [0:2], everything is OK. I think it is
> > because of unicode and the UTF-8 encoding (2 bytes).
>
> > How can I resolve this problem?
>
> > Best regards
> > --
> >http://mail.python.org/mailman/listinfo/python-list
>
> First: you're talking about utf8 encoding, but you've written latin1
> encoding. Even though I do not know Mako templates, there should be no
> problem in your snippet of code, if encoding is latin1, at least for
> what I can understand.
>
> Do not assume utf8 is a two byte encoding; utf8 is a variable length
> encoding. Indeed,
>
> 'a' encoded as utf8 is 'a' (one byte)
>
> 'à' encoded as utf8 is '\xc3\xa0' (two bytes).
>
> Can you explain what you're trying to accomplish (rather than how
> you're trying to accomplish it)?
>
> Regards
> Marco
>
> --
> Marco Bizzarri
> http://notenotturne.blogspot.com/
> http://iliveinpisa.blogspot.com/

When I use ${urllib.unquote(c.user.firstName)} without encoding to
latin-1, I get different characters than I expect: instead of Łukasz
I see garbled characters.
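
The symptoms described in this thread are what you would expect when the
slice is taken on UTF-8 bytes rather than on decoded text. The sketch
below is an illustration only, not the poster's actual code. It uses
Python 3 names (urllib.parse.quote and unquote_to_bytes), whereas the
thread uses Python 2's urllib.unquote, where the likely equivalent is
urllib.unquote(s).decode('utf-8')[0]. The variable quoted_name is a
hypothetical stand-in for c.user.firstName.

```python
from urllib.parse import quote, unquote_to_bytes

# Hypothetical stand-in for c.user.firstName:
# the UTF-8 bytes of the name, percent-encoded.
quoted_name = quote("Łukasz".encode("utf-8"))   # '%C5%81ukasz'

# Undo the percent-encoding; the result is still raw UTF-8 bytes.
raw = unquote_to_bytes(quoted_name)             # b'\xc5\x81ukasz'

# Decode to text first, then slice: indexing now counts characters.
name = raw.decode("utf-8")
first_char = name[0]                            # 'Ł'

# Slicing the bytes instead cuts the two-byte sequence for 'Ł' in half,
# which reproduces the "unexpected end of data" error from the thread.
try:
    raw[0:1].decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = str(exc)

# Treating the same bytes as latin-1 never fails, but maps every byte
# to its own character, so 'Ł' turns into two unrelated characters.
mojibake = raw.decode("latin-1")
```

Taking [0:2] only appears to work because 'Ł' happens to occupy two
bytes in UTF-8; a name starting with an ASCII letter or with a
three-byte character would be cut at the wrong place again.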


