Changing filenames from Greeklish => Greek (subprocess complain)

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sun Jun 9 07:50:02 EDT 2013


On Sun, 09 Jun 2013 10:55:43 +0200, Lele Gaifax wrote:

> Steven D'Aprano <steve+comp.lang.python at pearwood.info> writes:
> 
>> On Sat, 08 Jun 2013 22:09:57 -0700, nagia.retsina wrote:
>>
>>> chr('A') would give me the mapping of this char, the number 65 while
>>> ord(65) would output the char 'A' likewise.
>>
>> Correct. Python uses Unicode, where code-point 65 ("ordinal value 65")
>> means letter "A".
> 
> Actually, that's the other way around:
> 
>     >>> chr(65)
>     'A'
>     >>> ord('A')
>     65

/facepalm 

Of course you are right.


>>> What would happen if we try to re-encode bytes on the disk? like
>>> trying:
>>> 
>>> s = "νίκος"
>>> utf8_bytes = s.encode('utf-8')
>>> greek_bytes = utf_bytes.encode('iso-8869-7')
>>> 
>>> Can we re-encode twice or as many times as we want and then decode back
>>> respectively, like that?
>>
>> Of course. [...]

> Uhm, no: "encode" transforms a Unicode string into an array of bytes,
> "decode" does the opposite transformation. You cannot do the former on
> an "arbitrary" array of bytes:

And two for two. I misunderstood Nikos' question.

As you point out, no, Python 3 will not allow you to re-encode bytes. You 
must first decode them to a string, then encode that string using a 
different encoding. (I thought that this was what Nikos actually meant, 
but on re-reading his question more closely, that's not actually what 
he asked.)
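
For what it's worth, here is a minimal sketch of the decode-then-encode 
round trip, assuming Python 3 and reading the codec in the quoted question 
as 'iso-8859-7' (ISO Greek). The variable names are only illustrative:

```python
s = "νίκος"
utf8_bytes = s.encode("utf-8")          # str -> bytes

# In Python 3, bytes objects have no .encode() method, so trying to
# "re-encode" them directly raises AttributeError.
try:
    utf8_bytes.encode("iso-8859-7")
    reencode_failed = False
except AttributeError:
    reencode_failed = True

# The working route: decode back to str, then encode with the other codec.
text = utf8_bytes.decode("utf-8")       # bytes -> str
greek_bytes = text.encode("iso-8859-7") # str -> bytes, different encoding

# Decoding with the matching codec recovers the original string.
round_trip = greek_bytes.decode("iso-8859-7")
```

The two byte strings differ in length (UTF-8 uses two bytes for each of 
these Greek letters, ISO-8859-7 uses one), but both decode back to the 
same text as long as each is decoded with the codec that produced it.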

Sorry for any confusion.


-- 
Steven


