"More About Unicode in Python 2 and 3"

Mark Janssen dreamingforward at gmail.com
Mon Jan 6 14:21:44 EST 2014


> The argument is that a very important, if small, subset of data manipulation
> became very painful in Py3.  Not impossible, and not difficult, but painful
> because the mental model and the contortions needed to get things to work
> don't sync up anymore.

You are confused.  Please see my reply to you on the bytestring type thread.

> Painful because Python is, at heart, a simple and
> elegant language, but with the use-case of embedded ASCII in binary data
> that elegance went right out the window.

It went out the window only because the object model with the
type/class unification was wrong.  It was fine before.

Mark

>> It can't be both things. It's either bytes or it's text.
>
> Of course it can be:
>
> 0000000: 0372 0106 0000 0000 6100 1d00 0000 0000  .r......a.......
> 0000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0000020: 4e41 4d45 0000 0000 0000 0043 0100 0000  NAME.......C....
> 0000030: 1900 0000 0000 0000 0000 0000 0000 0000  ................
> 0000040: 4147 4500 0000 0000 0000 004e 1a00 0000  AGE........N....
> 0000050: 0300 0000 0000 0000 0000 0000 0000 0000  ................
> 0000060: 0d1a 0a                                  ...
>
> And there we are, mixed bytes and ASCII data.

No, you are printing debug output which shows both.  That's called CHEATING.
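
To make the distinction concrete, here is a minimal Python 3 sketch.  It
assumes the dump above is a dBase-style table header (the thread never names
the format, so that layout is a guess), and the names `dump` and `read_fields`
are made up for illustration.  The data stays bytes the whole way through;
the field names only become text at the explicit decode step.

```python
import struct

# The bytes from the quoted dump, retyped by hand.
dump = bytes.fromhex(
    "0372 0106 0000 0000 6100 1d00 0000 0000"
    "0000 0000 0000 0000 0000 0000 0000 0000"
    "4e41 4d45 0000 0000 0000 0043 0100 0000"
    "1900 0000 0000 0000 0000 0000 0000 0000"
    "4147 4500 0000 0000 0000 004e 1a00 0000"
    "0300 0000 0000 0000 0000 0000 0000 0000"
    "0d1a 0a"
)


def read_fields(data):
    """Return (name, type, length) for each 32-byte field descriptor.

    Assumed layout (dBase-style, not confirmed by the thread): descriptors
    start at offset 32, and a 0x0D byte marks the end of the list.
    """
    fields = []
    offset = 32
    while data[offset] != 0x0D:
        # 11 bytes of NUL-padded name, 1 type byte, 4 skipped bytes, 1 length byte.
        raw_name, raw_type, length = struct.unpack_from("<11sc4xB", data, offset)
        # Still bytes at this point; decoding is what turns them into text.
        name = raw_name.rstrip(b"\x00").decode("ascii")
        fields.append((name, raw_type.decode("ascii"), length))
        offset += 32
    return fields


print(read_fields(dump))
```

Nothing in `dump` is a str until `.decode("ascii")` is called on a slice of it.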

Mark



More information about the Python-list mailing list