[Tutor] improvements on a renaming script
bob gailer
bgailer at gmail.com
Mon Mar 10 03:50:02 CET 2014
On 3/9/2014 3:22 PM, street.sweeper at mailworks.org wrote:
> Hello all,
>
> A bit of background, I had some slides scanned and a 3-character
> slice of the file name indicates what roll of film it was.
> This is recorded in a tab-separated file called fileNames.tab.
> Its content looks something like:
>
> p01 200511_autumn_leaves
> p02 200603_apple_plum_cherry_blossoms
>
> The original file names looked like:
>
> 1p01_abc_0001.jpg
> 1p02_abc_0005.jpg
>
> The renamed files are:
>
> 200511_autumn_leaves_-_001.jpeg
> 200603_apple_plum_cherry_blossoms_-_005.jpeg
>
> The script below works and has done what I wanted, but I have a
> few questions:
>
> - In the get_long_names() function, the for/if thing is reading
> the whole fileNames.tab file every time, isn't it? In reality,
> the file was only a few dozen lines long, so I suppose it doesn't
> matter, but is there a better way to do this?
The "usual" way is to create a dictionary with the row[0] contents as keys
and the row[1] contents as values. Do this once per run. Then look up each
glnAbbrev in the dictionary and return the corresponding value.
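A rough sketch of that idea, not a drop-in replacement. load_long_names is a name I made up, and the sample data is held in an io.StringIO so the snippet runs without your fileNames.tab; in the real script you would pass the opened file instead.

```python
import csv
import io

def load_long_names(tabfile):
    """Read a tab-separated file object once and return {abbrev: long_name}."""
    reader = csv.reader(tabfile, delimiter='\t')
    # Skip blank or short rows so a trailing empty line does not raise IndexError.
    return {row[0]: row[1] for row in reader if len(row) >= 2}

# Stand-in for fileNames.tab (assumption: two tab-separated columns per line).
sample = io.StringIO(
    "p01\t200511_autumn_leaves\n"
    "p02\t200603_apple_plum_cherry_blossoms\n"
)

long_names = load_long_names(sample)   # built once per run
print(long_names['p01'])               # each later lookup is a plain dict access
```

With this in place the file is read a single time, however many pictures are renamed.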
> - Really, I wanted to create a new sequence number at the end of
> each file name, but I thought this would be difficult. In order
> for it to count from 01 to whatever the last file is per set p01,
> p02, etc, it would have to be aware of the set name and how many
> files are in it. So I settled for getting the last 3 digits of
> the original file name using splitext(). The strings were unique,
> so it worked out. However, I can see this being useful in other
> places, so I was wondering if there is a good way to do this.
> Is there a term or phrase I can search on?
I'm sorry, but I don't fully understand that paragraph. Why would
you need to know the number of files?
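If what you want is a counter that restarts for each set, I believe a running count per set is enough and the total never has to be known in advance. A minimal sketch under assumptions of mine: the set name is still fn[1:4], sorted file names give the order you want, and number_per_set is a hypothetical helper, not part of your script.

```python
from collections import defaultdict

def number_per_set(filenames):
    """Return (filename, set_name, seq) tuples, with seq restarting at 001 per set."""
    counters = defaultdict(int)          # one running counter per set name
    result = []
    for fn in sorted(filenames):         # assumption: sorted order is the desired order
        set_name = fn[1:4]
        counters[set_name] += 1
        result.append((fn, set_name, '%03d' % counters[set_name]))
    return result

files = ['1p02_abc_0005.jpg', '1p01_abc_0001.jpg', '1p01_abc_0004.jpg']
for fn, set_name, seq in number_per_set(files):
    print(fn, set_name, seq)
```

Searching on "counter per key" or "group by" along with "defaultdict" should turn up more on this pattern.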
> - I'd be interested to read any other comments on the code.
> I'm new to python and I have only a bit of computer science study,
> quite some time ago.
Beware of using tabs for indentation. As rendered by Thunderbird they
appear as 8 spaces, which is IMHO overkill.
It is much better to use spaces. Most Python IDEs have an option to
convert tabs to spaces.
The Python recommendation (PEP 8) is 4 spaces per level; I use 2.
> #!/usr/bin/env python3
>
> import os
> import csv
>
> # get longnames from fileNames.tab
> def get_long_name(glnAbbrev):
>     with open(
>         os.path.join(os.path.expanduser('~'),'temp2','fileNames.tab')
>     ) as filenames:
>         filenamesdata = csv.reader(filenames, delimiter='\t')
>         for row in filenamesdata:
>             if row[0] == glnAbbrev:
>                 return row[1]
>
> # find shortname from slice in picture filename
> def get_slice(fn):
>     threeColSlice = fn[1:4]
>     return threeColSlice
Writing a function just to take a slice also seems like overkill. Just slice in place.
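For instance, the new name could be built in one expression with the slices written inline. This is only a sketch: new_name and the small long_names dictionary are illustrative names of mine, and it assumes the abbreviation has already been mapped to its long name.

```python
import os

def new_name(fn, long_names):
    """Build the renamed file name from the original name, slicing in place."""
    # fn[1:4] is the 3-character roll abbreviation;
    # the last 3 characters of the basename are the sequence number.
    return long_names[fn[1:4]] + "_-_" + os.path.splitext(fn)[0][-3:] + ".jpeg"

long_names = {'p01': '200511_autumn_leaves'}
print(new_name('1p01_abc_0001.jpg', long_names))
```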
> # get 3-digit sequence number from basename
> def get_bn_seq(fn):
>     seq = os.path.splitext(fn)[0][-3:]
>     return seq
>
> # directory locations
> indir = os.path.join(os.path.expanduser('~'),'temp4')
> outdir = os.path.join(os.path.expanduser('~'),'temp5')
>
> # rename
> for f in os.listdir(indir):
>     if f.endswith(".jpg"):
>         os.rename(
>             os.path.join(indir,f),os.path.join(
>                 outdir,
>                 get_long_name(get_slice(f))+"_-_"+get_bn_seq(f)+".jpeg")
>             )
>
> exit()
>
HTH - remember to reply-all so a copy goes to the list, place your
comments in-line as I did, and delete irrelevant text.