just a bug
Jarek Zgoda
jzgoda at o2.usun.pl
Fri May 25 10:54:22 EDT 2007
Maksim Kasimov wrote:
>> 'utf8' codec can't decode bytes in position 176-177: invalid data
>> >>> iMessage[176:178]
>> '\xd1]'
>>
>> And that's your problem. In general you can't just truncate a utf-8
>> encoded string anywhere and expect the result to be valid utf-8. The
>> \xd1 at the very end of your CDATA section is the first byte of a
>> two-byte sequence that represents some unicode code-point between \u0440
>> and \u047f, but it's missing the second byte that says which one.
>
>
> In a previous message I already explained that this situation commonly
> arises with memory-limited devices, such as mobile terminals from Nokia,
> Sony Ericsson, Siemens and so on.
>
> And I pointed out to you that it is part of a split string.
No, it is not part of a string. It is part of a byte stream, split in the
middle of a multibyte-encoded character.
You can't take only the dot from a lowercase "i" and ask the parser to
treat it as a complete "i".
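
To illustrate the point, here is a minimal sketch of one way to handle a byte
stream that arrives in pieces: buffer the incomplete trailing bytes and decode
them only once the rest of the character has arrived. It uses the standard
library's incremental decoder and is written for Python 3, which is an
assumption on my part. The helper name decode_chunks and the sample data are
hypothetical and are not taken from the original code.

```python
import codecs


def decode_chunks(chunks, encoding="utf-8"):
    """Decode byte chunks that may be split in the middle of a character."""
    decoder = codecs.getincrementaldecoder(encoding)()
    pieces = []
    for chunk in chunks:
        # Incomplete trailing bytes are kept inside the decoder
        # until the next chunk supplies the rest of the character.
        pieces.append(decoder.decode(chunk))
    # final=True raises UnicodeDecodeError if bytes are still pending.
    pieces.append(decoder.decode(b"", final=True))
    return "".join(pieces)


data = "привет".encode("utf-8")      # two bytes per character
first, second = data[:3], data[3:]   # split inside the second character

try:
    first.decode("utf-8")            # the truncated chunk alone is invalid
except UnicodeDecodeError as exc:
    print("first chunk alone fails:", exc.reason)

print(decode_chunks([first, second]))
```

Here the first chunk ends with the byte \xd1, the same situation as in the
traceback quoted above: decoding that chunk on its own fails, while feeding
both chunks through one decoder recovers the full text.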
--
Jarek Zgoda
http://jpa.berlios.de/