hex dump w/ or w/out utf-8 chars

Chris Angelico rosuav at gmail.com
Mon Jul 8 13:52:17 EDT 2013


On Tue, Jul 9, 2013 at 3:31 AM,  <ferdy.blatsco at gmail.com> wrote:
> Unfortunately (as probably I told you before) I will never pass to
> Python 3...  Guido should not always listen only to gurus like him...
> I don't like Python as before...starting from OOP and ending with codecs
> like utf-8. Regarding OOP, much appreciated especially by experts, he
> could use python 2 for hiding the complexities of OOP (improving, as an
> effect, object's code hiding) moving classes and objects to
> imported methods, leaving in this way the programming style to the
> well known old style: sequential programming and functions.
> About utf-8... the same solution: keep utf-8 but for the non experts, add
> methods to convert to solutions which use the range 128-255 of only one
> byte (I do not give a damn about Chinese and "similia"!...)
> I know that is a lost battle (in Italian "una battaglia persa")!

Well, there won't be a Python 2.8, so you really should consider
moving at some point. Python 3.3 is already way better than 2.7 in
many ways, 3.4 will improve on 3.3, and the future is pretty clear.
But nobody's forcing you, and 2.7.x will continue to get
bugfix/security releases for a while. (Personally, I'd be happy if
everyone moved off the 2.3/2.4 releases. It's not too hard supporting
2.6+ or 2.7+.)

The thing is, you're thinking about UTF-8, but you should be thinking
about Unicode. I recommend you read these articles:

http://www.joelonsoftware.com/articles/Unicode.html
http://unspecified.wordpress.com/2012/04/19/the-importance-of-language-level-abstract-unicode-strings/
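
To make the UTF-8 versus Unicode distinction concrete, here is a minimal
sketch for Python 3.5 or later (bytes.hex() needs 3.5). The sample word
and the variable names are my own illustration, not anything from your
script. The text is a sequence of Unicode characters; UTF-8 and Latin-1
are just two ways of turning that text into bytes, and only the bytes
show up in a hex dump.

```python
# Text is a sequence of Unicode code points; an encoding such as UTF-8
# is only one way of serializing that text to bytes.
text = "caf\u00e9"                # 4 characters: c, a, f, e-acute
utf8 = text.encode("utf-8")       # 5 bytes: e-acute takes two bytes
latin1 = text.encode("latin-1")   # 4 bytes: e-acute is the single byte 0xE9

print(len(text), len(utf8), len(latin1))
print(utf8.hex(), latin1.hex())

# Both byte strings decode back to the same text.
assert utf8.decode("utf-8") == latin1.decode("latin-1") == text
```

The single-byte 128-255 form is still available as an explicit encoding
step, but it only covers the characters that Latin-1 happens to include.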

So long as you are thinking about different groups of characters as
different, and wanting a solution that maps characters down into the
<256 range, you will never be able to cleanly internationalize. With
Python 3.3+, you can ignore the differences between ASCII, BMP, and
SMP characters; they're all just "characters". Everything works
perfectly with Unicode.
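
A small sketch of what that looks like in practice, assuming Python 3.3
or later. The three sample characters are my own picks, one from each
range. Each counts as exactly one character in a str, whatever number of
bytes it would need in UTF-8.

```python
# One sample character from each range (chosen for illustration).
samples = {
    "ASCII": "a",            # U+0061
    "BMP": "\u20ac",         # U+20AC EURO SIGN
    "SMP": "\U0001F600",     # U+1F600 GRINNING FACE
}

for name, ch in samples.items():
    # Length in characters, code point, and length in UTF-8 bytes.
    print(name, len(ch), hex(ord(ch)), len(ch.encode("utf-8")))

# Indexing, length, and slicing all work per character.
word = "a\u20ac\U0001F600"
assert len(word) == 3
assert word[2] == "\U0001F600"
assert word[::-1] == "\U0001F600\u20aca"
```

The byte counts differ (1, 3, and 4 in UTF-8), but that only matters at
the point where you encode the string for output.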

ChrisA


