Changing filenames from Greeklish => Greek (subprocess complain)

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sat Jun 8 03:57:05 EDT 2013


On Thu, 06 Jun 2013 23:35:33 -0700, nagia.retsina wrote:

>> Working with bytes is only for when the file names are turned to
>> garbage. Your file names (some of them) are turned to garbage. Fix
>> them, and then use file names as strings.
> 
> Can't. '~/data/apps/' is filled every day with more and more files,
> which are uploaded via the FileZilla client, which I think behaves pretty
> much like PuTTY, uploading filenames as Greek ISO bytes.


Well, that is certainly a nuisance. Try something like this:

# Untested.

import os

directory = b'/home/nikos/public_html/data/apps/'  # This must be bytes.
for name in os.listdir(directory):  # Bytes in, bytes out.
    pathname_as_bytes = directory + name
    # Try the most likely encodings in order. latin-1 can decode any
    # byte string, so it acts as the last resort.
    for encoding in ('utf-8', 'iso-8859-7', 'latin-1'):
        try:
            pathname = pathname_as_bytes.decode(encoding)
        except UnicodeDecodeError:
            continue
        # Rename to something valid in UTF-8.
        if encoding != 'utf-8':
            os.rename(pathname_as_bytes, pathname.encode('utf-8'))
        # Check using bytes, so the test does not depend on the locale.
        assert os.path.exists(pathname.encode('utf-8'))
        break
    else:
        # This only runs if we never reached the break.
        raise ValueError('unable to clean filename %r' % pathname_as_bytes)
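
If you want to try the decoding step without touching any files, here is a
rough sketch of the same fallback idea as a standalone function. The name
decode_filename and its encodings parameter are made up for this example,
they are not part of the standard library, and the sample file name is
invented too.

```python
def decode_filename(raw, encodings=('utf-8', 'iso-8859-7', 'latin-1')):
    """Return (text, encoding) for the first encoding that decodes raw.

    Hypothetical helper, for illustration only.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Only reachable if latin-1 (which accepts any bytes) is not in the list.
    raise ValueError('unable to decode filename %r' % raw)


# Simulate a name uploaded as Greek ISO bytes (invented example name).
greek_iso = 'Ευρώπη.exe'.encode('iso-8859-7')
text, used = decode_filename(greek_iso)
print(used, text.encode('utf-8'))
```

Keep in mind that a successful decode only means the bytes were legal in that
encoding, not that the guess was right. A name written in some other 8-bit
encoding will still decode as iso-8859-7 or latin-1, just with the wrong
characters, so it is worth eyeballing the results before renaming anything.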


-- 
Steven


