[Python-Dev] Python3 "complexity"

Paul Moore p.f.moore at gmail.com
Thu Jan 9 23:54:23 CET 2014


On 9 January 2014 22:08, Ethan Furman <ethan at stoneleaf.us> wrote:
> For example:  b'\x01\x00\xd1\x80\xd1\x83\xd0\x80'
>
> If that were decoded using latin1 how would I then get the first two bytes
> to the integer 256 and the last six bytes to their Cyrillic meaning?
> (Apologies for not testing myself, short on time.)

I cannot conceive why you would. Slice the bytes then use
struct.unpack on the first 2 bytes and decode on the last 6. We're
talking about using latin1 for cases where you want to treat the text
as essentially ASCII (with a few bits of binary junk you want to
ignore). Please don't take away the message that latin1 makes things
"just like Python 2.X" - that's completely the wrong idea.
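
A minimal sketch of the slice-then-decode approach, under assumptions the
thread does not state: the integer is taken to be big-endian unsigned 16-bit
(so that b'\x01\x00' gives 256), the text is taken to be UTF-8, and the
`\83` in the quoted example is read as `\x83`. The variable names are
illustrative only.

```python
import struct

# Example payload from the quoted message (with \83 read as \x83).
data = b'\x01\x00\xd1\x80\xd1\x83\xd0\x80'

# First two bytes: assumed big-endian unsigned 16-bit integer.
(number,) = struct.unpack('>H', data[:2])

# Last six bytes: assumed UTF-8 encoded text (three Cyrillic characters).
text = data[2:].decode('utf-8')

# By contrast, latin1 maps each byte to one code point, so it round-trips
# losslessly but does not recover the integer or the Cyrillic text.
as_latin1 = data.decode('latin1')

print(number)
print(text)
print(len(as_latin1), as_latin1.encode('latin1') == data)
```

The latin1 result is an 8-character str that encodes back to the original
bytes, which is why it suits the "mostly ASCII, ignore the rest" case and
not structured binary data like this.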

Paul

