Encoding questions (continuation)

Steven D'Aprano steve+comp.lang.python at pearwood.info
Mon Jun 10 07:41:07 EDT 2013


On Mon, 10 Jun 2013 14:13:00 +0300, Νικόλαος Κούρας wrote:

> Τη Δευτέρα, 10 Ιουνίου 2013 1:42:25 μ.μ. UTC+3, ο χρήστης Andreas
> Perstinger έγραψε:
> 
>  >  >>> s = b'\xce\xb1'
>  >  >>> s[0]
>  > 206
> 
> 's' is a byte object, how can you treat it as a string asking to present
> you its first character?

That is not treating it as a string, and it does not present the first 
character. It presents the first byte, which is a number between 0 and 
255, not a character.

py> alist = [0xce, 0xb1]
py> alist[0]
206

Is that treating alist as a string? No, of course not. Strings are not 
the only objects that support indexing with object[position].
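
A minimal sketch of the same point for a bytes object in Python 3 (the 
variable names are only illustrative): indexing gives an integer, slicing 
gives another bytes object, and only decoding gives a character.

```python
s = b'\xce\xb1'

first = s[0]              # indexing a bytes object yields an int
first_slice = s[0:1]      # slicing yields a bytes object of length 1
as_list = list(s)         # the same two integers, as a plain list
text = s.decode('utf-8')  # decoding is what turns bytes into a string

print(first, first_slice, as_list, text)
```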


> 's' is a byte object, how can you treat it as a string asking to present
> you its first character?

You just asked that exact same question. Why ask it twice?


>  > A byte object is a sequence of bytes (= integer values) and support
>  > indexing
> 
> A sequeence of bystes is a a sequence of bits which is zeros and one's
> not integers.

Nikos, you fail basic computers. Time for you to step away from the 
computer, go to the library, and borrow a book about the fundamentals 
of how computers work. Perhaps something written for school children.

I am not saying this to insult you, or to be rude. But you are obviously 
struggling with the most basic concepts, like what a byte is. You need to 
go back to basics and learn the simple things, and perhaps if it is 
explained to you in your native language, you will understand it better.
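
On "bits, not integers": the two descriptions are the same thing seen 
from different angles. A byte is eight bits, and eight bits read as a 
binary number give an integer from 0 to 255. A short sketch, using the 
byte 0xce from the example above (the names are made up for illustration):

```python
byte_value = 0xce                  # the first byte of b'\xce\xb1'

bits = format(byte_value, '08b')   # the same value written as eight bits
round_trip = int(bits, 2)          # reading those bits as a base-2 number

# A bytes object can be built directly from such integers.
rebuilt = bytes([0xce, 0xb1])

print(byte_value, bits, round_trip, rebuilt)
```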


>  > Because your method doesn't work.
>  > If you use all possible 256 bit-combinations to represent a valid
>  > character, how do you decide where to stop in a sequence of bytes?
> 
> How you mean? please provice an example so i can understand this.

I have already provided an example. Many other people have provided 
examples. Please read them.
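
For reference, here is one more sketch of how UTF-8 answers the "where 
to stop" question: the first byte of each character announces how many 
bytes belong to it, which is only possible because not all 256 byte 
values are used as stand-alone characters. The helper names below are 
hypothetical, and the code is deliberately simplified: it does not check 
continuation bytes or reject overlong forms the way a real decoder does.

```python
def utf8_sequence_length(lead_byte):
    """Return how many bytes the character starting with lead_byte uses."""
    if lead_byte < 0x80:           # 0xxxxxxx: single-byte (ASCII) character
        return 1
    if 0xC0 <= lead_byte < 0xE0:   # 110xxxxx: start of a 2-byte sequence
        return 2
    if 0xE0 <= lead_byte < 0xF0:   # 1110xxxx: start of a 3-byte sequence
        return 3
    if 0xF0 <= lead_byte < 0xF8:   # 11110xxx: start of a 4-byte sequence
        return 4
    # 10xxxxxx is a continuation byte and cannot start a character.
    raise ValueError('not a valid UTF-8 lead byte: %#x' % lead_byte)


def split_characters(data):
    """Split UTF-8 bytes into one bytes object per character."""
    chunks = []
    position = 0
    while position < len(data):
        length = utf8_sequence_length(data[position])
        chunks.append(data[position:position + length])
        position += length
    return chunks


# 'a', Greek small alpha, and the euro sign: 1, 2 and 3 bytes respectively.
encoded = 'a\u03b1\u20ac'.encode('utf-8')
pieces = split_characters(encoded)
print(pieces)
```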


>  > > EBCDIC and ASCII and Unicode are charactet sets, correct?
> 
>  > > iso-8859-1, iso-8859-7, utf-8, utf-16, utf-32 and so on are
>  > > encoding methods, right?
> 
>  > Look at http://www.unicode.org/glossary/ for an explanation of all
>  > the terms
> 
> I did but docs confuse me even more. Can you pleas ebut it simple.

Nikos, if you can't be bothered to correct your spelling mistakes, why 
should we be bothered to answer your questions? It takes you half a 
second to fix a typo like "pleas ebut". It takes us five, ten, fifteen, 
twenty minutes to write an email explaining these concepts, and then you 
don't bother to read them and just ask the same question again. And 
again. And again.
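
As for the character set versus encoding question itself, a rough sketch 
may help. In Unicode, a character has one number (its code point), and 
each encoding is a different rule for writing that number as bytes. The 
example uses Greek small alpha; the variable names are illustrative only.

```python
alpha = '\u03b1'                        # Greek small letter alpha

code_point = ord(alpha)                 # its number in the Unicode character set

# Different encodings turn the same character into different bytes.
as_utf8 = alpha.encode('utf-8')
as_utf16 = alpha.encode('utf-16-be')
as_utf32 = alpha.encode('utf-32-be')
as_greek = alpha.encode('iso-8859-7')   # a single-byte encoding for Greek

# iso-8859-1 (Latin-1) has no byte for alpha, so encoding fails.
try:
    alpha.encode('iso-8859-1')
    latin1_can_encode = True
except UnicodeEncodeError:
    latin1_can_encode = False

print(hex(code_point), as_utf8, as_utf16, as_utf32, as_greek, latin1_can_encode)
```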


> ps. i tried to post a reply to the thread i opend via thunderbird mail
> client, but not as a reply to somne other reply but as  new mail send to
> python list.
> because of that a new thread will be opened. How can i tell thunderbird
> to reply to the original thread and not start a new one?

By replying to an email in that thread.



-- 
Steven


