hex dump w/ or w/out utf-8 chars

wxjmfauth at gmail.com wxjmfauth at gmail.com
Sat Jul 13 03:56:52 EDT 2013


On Friday, 12 July 2013 04:16:21 UTC+2, Chris Angelico wrote:
> On Fri, Jul 12, 2013 at 4:42 AM, <wxjmfauth at gmail.com> wrote:
> > BTW, since when does a serious coding scheme need an external marker?
>
> All of them.
>
> Content-type: text/plain; charset=UTF-8
>
> ChrisA

------


None of them.

You are confusing the knowledge of which coding scheme is in use with
the intrinsic information a "coding scheme" *may* be required to carry
in order to work properly. These are conceptually two different things.

I am convinced you do not understand utf-8 very well conceptually.
As I have written many times, "utf-8 does not produce bytes, but Unicode
Encoding Units".

A similar coding scheme: iso-6937.
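
To make the remark about utf-8 encoding units concrete, here is a minimal
sketch in Python. It assumes that "intrinsic information" refers to the
high bits of each 8-bit unit, which mark it as a lead unit or a
continuation unit. The helper name utf8_unit_kind is hypothetical, and the
function does not reject unit values that are invalid in utf-8.

```python
def utf8_unit_kind(unit: int) -> str:
    """Classify one 8-bit utf-8 code unit by its high bits (illustration only)."""
    if unit < 0x80:
        return "ascii"          # 0xxxxxxx: a complete one-unit sequence
    if unit < 0xC0:
        return "continuation"   # 10xxxxxx: never starts a sequence
    if unit < 0xE0:
        return "lead-2"         # 110xxxxx: starts a two-unit sequence
    if unit < 0xF0:
        return "lead-3"         # 1110xxxx: starts a three-unit sequence
    return "lead-4"             # 11110xxx: starts a four-unit sequence


# "a", "e with acute accent", and the euro sign, written as escapes
text = "a\u00e9\u20ac"
units = list(text.encode("utf-8"))
kinds = [utf8_unit_kind(u) for u in units]

for u, k in zip(units, kinds):
    print(f"{u:02x} {k}")
```

Each unit announces its own role within the stream, which is separate from
the question of how a reader learns that the stream is utf-8 in the first
place.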

Try to write an editor, a text widget, with a coding
scheme like the Flexible String Representation. You will
quickly notice that it is impossible (that is, impossible to do correctly).
(You do not need a computer, just a sheet of paper and a pencil.)
Hint: what is the character at the caret position?

jmf




