[Tutor] Renaming files

Magnus Lycka magnus@thinkware.se
Fri Jan 24 05:15:01 2003


At 22:50 2003-01-23 -0500, Gus Tabares wrote:
>         I'm trying to rename files that have endings as such:
>blah.mp3.bunchofchars . I'm trying to strip off the extra chars and just
>have *.mp3. The files were fed into a list via the os.listdir()
>function. The searching I've done so far has come up empty. Should I be
>using the string module for something like this or is this a simple
>slice operation I'm missing? Any help is greatly appreciated.

You can certainly use os.rename as J"o suggested, but I don't know
what he wants .strip for, it removes whitespace, and os.path.basename
strips off the part of a path that precedes a file name, and you don't
get that from os.listdir, so that's not very useful here either.
On the other hand, os.path.splitext *might* be right...

 >>> import os
 >>> help(os.path.splitext)
Help on function splitext in module ntpath:
splitext(p)
     Split the extension from a pathname.

     Extension is everything from the last dot to the end.
     Return (root, ext), either part may be empty.

It depends a little on whether we know that there is stuff after
.mp3 or not. If a file might actually end with '.mp3', we
don't want *that* extension removed. Also, if there are dots in
the string *after* '.mp3.' we still want ALL of it removed.

I'd feel safer looking explicitly for .mp3 in the name in this case.
Let's experiment a bit...

 >>> name = "blah.mp3.bunchofchars"
 >>> end = '.mp3'
 >>> name.find(end)
4
 >>> len(end)
4
 >>> #Hm...
 >>> print name[:name.find(end)+len(end)]
blah.mp3
 >>> for name in ['blaha.mp3', 'sdffg.mp3.sdf', 'not.an.mp.3', 'x.mp3.x.y']:
...     print "Find trick", name[:name.find(end)+len(end)]
...     print "Splitext", os.path.splitext(name)[0]
...
Find trick blaha.mp3
Splitext blaha
Find trick sdffg.mp3
Splitext sdffg.mp3
Find trick not
Splitext not.an.mp
Find trick x.mp3
Splitext x.mp3.x

So:

Find trick blaha.mp3
Splitext blaha

In this case os.path.splitext removed .mp3. Bad!

Find trick sdffg.mp3
Splitext sdffg.mp3

Both ok.

Find trick not
Splitext not.an.mp

Both messed up files that don't contain .mp3 at all. :(

Find trick x.mp3
Splitext x.mp3.x

os.path.splitext failed here too...

Hm... It seems the find version is more robust, but it will just
return the first three characters if there is no '.mp3' in the
file name (find returns -1 then, so the slice ends at -1 + 4 = 3).
If we got the names from something like glob.glob('*.mp3*') I guess
that's ok (os.listdir itself takes a directory, not a wildcard
pattern), but if we want both belt and suspenders we might add
a check for that.

 >>> for name in ['blaha.mp3', 'sdffg.mp3.sdf', 'not.an.mp.3', 'x.mp3.x.y']:
...     newName = name[:name.find(end)+len(end)]
...     if newName.endswith('.mp3'):
...             print "Ok, put this in os.rename:", newName
...     else:
...             print "No, I don't like this name:", newName
...
Ok, put this in os.rename: blaha.mp3
Ok, put this in os.rename: sdffg.mp3
No, I don't like this name: not
Ok, put this in os.rename: x.mp3
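
If you want the find trick wrapped up for reuse, something like the
following might do. It is only a sketch: trim_after_ext is a made-up
name, not anything from the standard library, and it is written with
Python 3 style print calls rather than the print statements used in
the sessions above. It checks the result of find directly instead of
looking at the sliced name afterwards.

```python
def trim_after_ext(name, end=".mp3"):
    """Return name cut off right after the first `end`, or None if absent.

    Hypothetical helper for illustration only.
    """
    pos = name.find(end)
    if pos == -1:
        # str.find signals "not found" with -1; slicing with that value
        # would silently give us the first few characters of the name.
        return None
    return name[:pos + len(end)]


for name in ["blaha.mp3", "sdffg.mp3.sdf", "not.an.mp.3", "x.mp3.x.y"]:
    print(name, "->", trim_after_ext(name))
```

Returning None for the "no .mp3 here" case leaves it to the caller to
decide whether to skip the file or complain.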

The newName.endswith('.mp3') check seems to fit our purposes very
well. If we really feel that os.listdir SHOULD only have given us
file names containing .mp3, we might want to turn this into an
assert:

 >>> for name in ['blaha.mp3', 'sdffg.mp3.sdf', 'not.an.mp.3', 'x.mp3.x.y']:
...     newName = name[:name.find(end)+len(end)]
...     assert newName.endswith('.mp3')
...     print "Ok, put this in os.rename:", newName
...
Ok, put this in os.rename: blaha.mp3
Ok, put this in os.rename: sdffg.mp3
Traceback (most recent call last):
   File "<interactive input>", line 3, in ?
AssertionError
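
One caveat with assert: Python drops assert statements when it is run
with the -O switch, so the check disappears in optimized runs. If the
check should always happen, raising an exception explicitly is an
alternative. A minimal sketch, assuming Python 3 and a made-up helper
name (checked_new_name):

```python
def checked_new_name(name, end=".mp3"):
    """Return the trimmed name, or raise ValueError if `end` is missing.

    Hypothetical helper; same slicing as in the session above.
    """
    new_name = name[:name.find(end) + len(end)]
    if not new_name.endswith(end):
        # Unlike assert, this check is still performed under "python -O".
        raise ValueError("no %r in file name %r" % (end, name))
    return new_name
```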

I'm sure you can loop over os.listdir(something) and exchange the
print for os.rename(name, newName)...

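Oh yes, one more thing. os.listdir takes a directory, not a wildcard
pattern, and it returns bare file names without the directory part.
So this won't work of course:

for name in os.listdir('/some/path/to/wherever/'):
     ...
     os.rename(name, newName)

os.rename won't know what directory you pointed at in os.listdir...
Obvious but easy to miss. It's probably easiest to change directory
first, and to use glob.glob if you want the '*.mp3*' filtering:

import glob
os.chdir('/some/path/to/wherever/')
for name in glob.glob('*.mp3*'):
     ...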
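
If changing the working directory is not wanted, another way is to
join the directory onto each name with os.path.join before calling
os.rename. The following is a sketch under a few assumptions of mine:
Python 3, a made-up function name (rename_mp3s), and a policy of never
overwriting a file that already exists under the new name.

```python
import os


def rename_mp3s(directory, end=".mp3"):
    """Rename 'x.mp3.junk' to 'x.mp3' inside directory.

    Returns a list of (old_name, new_name) pairs for the renames done.
    Hypothetical helper for illustration only.
    """
    done = []
    for name in os.listdir(directory):
        pos = name.find(end)
        if pos == -1:
            continue  # no '.mp3' in this name, leave it alone
        new_name = name[:pos + len(end)]
        if new_name == name:
            continue  # nothing after '.mp3', nothing to do
        src = os.path.join(directory, name)
        dst = os.path.join(directory, new_name)
        if os.path.exists(dst):
            continue  # assumption: never overwrite an existing file
        os.rename(src, dst)
        done.append((name, new_name))
    return done
```

Trying it on a scratch copy of the directory first is probably wise,
since a rename is not easily undone.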

Another option would obviously be to use good old regular expressions.

 >>> names = "\n".join(['blaha.mp3', 'sdffg.mp3.sdf', 'not.an.mp.3',
...                    'x.mp3.x.y'])
 >>> import re
 >>> re.findall(r'(.*\.mp3)', names)
['blaha.mp3', 'sdffg.mp3', 'x.mp3']
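
Note that the pattern above is greedy, so on each line it keeps
everything up to the *last* '.mp3', which differs from the find trick
when '.mp3' occurs twice in a name. A per-name variant with a
non-greedy pattern stops at the first '.mp3' instead. This is a sketch
assuming Python 3; FIRST_MP3 and trim_with_regex are names I made up.

```python
import re

# Non-greedy '.*?' stops at the first '.mp3', like name.find('.mp3').
FIRST_MP3 = re.compile(r"(.*?\.mp3)")


def trim_with_regex(name):
    """Return name up to and including the first '.mp3', or None."""
    match = FIRST_MP3.match(name)
    return match.group(1) if match else None


for name in ["blaha.mp3", "sdffg.mp3.sdf", "not.an.mp.3", "a.mp3.mp3.x"]:
    print(name, "->", trim_with_regex(name))
```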



-- 
Magnus Lycka, Thinkware AB
Alvans vag 99, SE-907 50 UMEA, SWEDEN
phone: int+46 70 582 80 65, fax: int+46 70 612 80 65
http://www.thinkware.se/  mailto:magnus@thinkware.se