Changing filenames from Greeklish => Greek (subprocess complain)

MRAB python at mrabarnett.plus.com
Wed Jun 5 14:32:15 EDT 2013


On 05/06/2013 18:43, Νικόλαος Κούρας wrote:
> On Wednesday, 5 June 2013 8:56:36 AM UTC+3, Steven D'Aprano wrote:
>
> Somehow, I don't know how because I didn't see it happen, you have one or
> more files in that directory where the file name as bytes is invalid when
> decoded as UTF-8, but your system is set to use UTF-8. So to fix this you
> need to rename the file using some tool that doesn't care quite so much
> about encodings. Use the bash command line to rename each file in turn
> until the problem goes away.
>
> But renaming via shell access, like mv 'Euxi tou Ihsou.mp3' 'Ευχή του Ιησου.mp3', leads to a filename in some unknown encoding, with this bytestream: '\305\365\367\336\ \364\357\365\ \311\347\363\357\375.mp3'
>
> But please tell me, Steven, which Linux tool you think can re-encode the weird filename to proper UTF-8, 'Ευχή του Ιησου.mp3'?
>
> Or we can write a script, as I suggested, that decodes the bytestream using all sorts of available charsets until it boils down to the original Greek letters.
>
Using Python, I think you could get the filenames using os.listdir,
passing the directory name as a bytestring so that it'll return the
names as bytestrings.
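
As a rough sketch of that first step (Python 3 assumed; the helper names
is_valid_utf8, preview_decodings and find_bad_names are made up for
illustration, and the candidate encodings are only guesses for Greek text,
not something known about these particular files), you could list the names
as bytes, pick out the ones that aren't valid UTF-8, and look at how each
decodes under a few candidates:

```python
import os

def is_valid_utf8(raw_name):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw_name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def preview_decodings(raw_name, encodings):
    """Map each candidate encoding to the decoded text, skipping failures."""
    results = {}
    for encoding in encodings:
        try:
            results[encoding] = raw_name.decode(encoding)
        except UnicodeDecodeError:
            pass  # this candidate cannot represent these bytes
    return results

def find_bad_names(directory):
    """List filenames (as bytes) in `directory` that are not valid UTF-8."""
    # Passing bytes to os.listdir makes it return bytes, undecoded.
    return [name for name in os.listdir(os.fsencode(directory))
            if not is_valid_utf8(name)]

if __name__ == "__main__":
    # Candidate encodings are an assumption; adjust to taste.
    for name in find_bad_names("."):
        print(name, preview_decodings(name, ["iso-8859-7", "cp1253", "cp737"]))
```

Reading the printed candidates by eye should show which encoding gives
sensible Greek.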

Then, for each name, you could decode from its current encoding and
encode to UTF-8 and rename the file, passing the old and new paths to
os.rename as bytestrings.
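
Something along these lines might do the renaming (again Python 3 assumed;
to_utf8_name and rename_to_utf8 are hypothetical names, and the source
encoding is something you have to supply yourself). It defaults to a dry
run that only reports what it would rename:

```python
import os

def to_utf8_name(raw_name, source_encoding):
    """Return the UTF-8 form of raw_name, or None if it is already UTF-8.

    Caveat: bytes in a legacy encoding can occasionally also be valid
    UTF-8 by coincidence; such names are left alone here.
    """
    try:
        raw_name.decode("utf-8")
        return None
    except UnicodeDecodeError:
        # Raises UnicodeDecodeError if source_encoding is the wrong guess
        # and cannot decode these bytes.
        return raw_name.decode(source_encoding).encode("utf-8")

def rename_to_utf8(directory, source_encoding, dry_run=True):
    """Rename non-UTF-8 filenames in `directory`; return (old, new) paths."""
    directory = os.fsencode(directory)  # work in bytes throughout
    planned = []
    for old_name in os.listdir(directory):
        new_name = to_utf8_name(old_name, source_encoding)
        if new_name is None:
            continue
        old_path = os.path.join(directory, old_name)
        new_path = os.path.join(directory, new_name)
        if os.path.exists(new_path):
            continue  # never overwrite an existing file
        planned.append((old_path, new_path))
        if not dry_run:
            os.rename(old_path, new_path)
    return planned
```

For example, rename_to_utf8(some_directory, "iso-8859-7") would list the
planned renames, and adding dry_run=False would carry them out. The
encoding there is only a guess that happens to fit the bytes quoted above.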


