Changing filenames from Greeklish => Greek (subprocess complain)

jmfauth wxjmfauth at gmail.com
Thu Jun 6 01:54:46 EDT 2013


On 5 June, 19:43, Νικόλαος Κούρας <nikos.gr... at gmail.com> wrote:
> On Wednesday, 5 June 2013 8:56:36 a.m. UTC+3, user Steven D'Aprano wrote:
>
> Somehow, I don't know how because I didn't see it happen, you have one or
> more files in that directory where the file name as bytes is invalid when
> decoded as UTF-8, but your system is set to use UTF-8. So to fix this you
> need to rename the file using some tool that doesn't care quite so much
> about encodings. Use the bash command line to rename each file in turn
> until the problem goes away.
>
> But renaming via shell access, like mv 'Euxi tou Ihsou.mp3' 'Ευχή του Ιησου.mp3', leads to that unknown encoding of this bytestream: '\305\365\367\336\ \364\357\365\ \311\347\363\357\375.mp3'
>
> But please tell me, Steven, which Linux tool do you think can encode the weird filename to proper UTF-8, 'Ευχή του Ιησου.mp3'?
>
> Or we can write a script, as I suggested, to decode the bytestream back using all sorts of available charsets, boiling it down to the original Greek letters.
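
The script idea quoted above can be sketched roughly as follows. This is only an illustration, not something posted in the thread: the byte string is copied from the octal dump in the quote (the "\ " sequences there are just escaped spaces), and the candidate encoding list and the helper name candidate_decodings are assumptions made for the example.

```python
# Raw bytes of the filename, copied from the octal dump quoted above.
raw = b"\305\365\367\336 \364\357\365 \311\347\363\357\375.mp3"

# Encodings to try. This list is an assumption; extend it as needed.
CANDIDATES = ["utf-8", "iso8859_7", "cp1253", "latin-1"]


def candidate_decodings(data, encodings=CANDIDATES):
    """Return {encoding: text} for each encoding that decodes data without error."""
    results = {}
    for enc in encodings:
        try:
            results[enc] = data.decode(enc)
        except UnicodeDecodeError:
            # This encoding cannot represent the byte sequence; skip it.
            pass
    return results


decoded = candidate_decodings(raw)
for enc, text in decoded.items():
    # ascii() keeps the output printable even on a non-UTF-8 terminal.
    print(enc, ascii(text))
```

UTF-8 rejects these bytes, ISO-8859-7 turns them into Greek letters, and Latin-1 accepts anything, which is how garbled names of the 'Åõ÷Þ ...' kind come about. A decode that succeeds is not proof that the encoding is right, so a human still has to pick the plausible result. Once one is chosen, something like os.rename(raw, text.encode("utf-8")) would rename the file using byte paths on both sides.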

---------------

see
http://bugs.python.org/issue13643, msg149949, author: Antoine Pitrou (pitrou)


Quote:

So, you're complaining about something which works, kind of:

$ touch héhé
$ LANG=C python3 -c "import os; print(os.listdir())"
['h\udcc3\udca9h\udcc3\udca9']
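
For anyone puzzled by the listing above: the \udcXX items are lone surrogates produced by the surrogateescape error handler (PEP 383), one per undecodable byte. The sketch below is an addition for illustration, not part of the quoted message. It assumes the original bytes were UTF-8 and reverses the escaping explicitly.

```python
# The name exactly as os.listdir() reported it under LANG=C.
name = "h\udcc3\udca9h\udcc3\udca9"

# Each surrogate U+DCxx maps back to the original byte 0xxx.
raw = name.encode("ascii", errors="surrogateescape")

# Assumption: the file name was written as UTF-8 on disk.
fixed = raw.decode("utf-8")

print(raw)           # the original bytes of the file name
print(ascii(fixed))  # the intended name, shown with escapes
```

So nothing is lost in the listing: the raw bytes survive the round trip, and the name can be decoded properly once the real encoding is known.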

> This makes robustly working with non-ascii filenames on different
> platforms needlessly annoying, given no modern nix should have problems
> just using UTF-8 in these cases.

So why don't these supposedly "modern" systems at least set the
appropriate environment variables for Python to infer the proper
character encoding?
(since these "modern" systems don't have a well-defined encoding...)

Answer: because they are not modern at all, they are antiquated,
inadapted and obsolete pieces of software designed and written by
clueless Anglo-American people. Please report bugs against these
systems. The culprit is not Python, it's the Unix crap and the utterly
clueless attitude of its maintainers ("filesystems are just bytes",
yeah, whatever...).

jmf


