Script Optimization
Gabriel Genellina
gagsl-py2 at yahoo.com.ar
Mon May 5 01:04:53 EDT 2008
On Sun, 04 May 2008 17:01:15 -0300, lev <levlozhkin at gmail.com> wrote:
>> * Change indentation from 8 spaces to 4
> I like using tabs because of the text editor I use, the script at
> the end is with 4 though.
Can't you configure it to use 4 spaces per indent - and not use "hard" tabs?
>> * Remove useless "pass" and "return" lines
> I replaced the return nothing lines with passes, but I like
> keeping them in case the indentation is ever lost - makes it easy to
> go back to original indentation
I can't think of a case where only the indentation "is lost": if you have a crash or something, you normally lose much more than indentation. Simple backups or a version control system like cvs/svn will help. So I don't see the usefulness of those "pass" statements; I think that after some time using Python you'll consider them just garbage, like everyone else.
>> * Temporarily change broken "chdir" line
> removed as many instances of chdir as possible (a few useless ones
> to accommodate the functions - changed functions to not chdir as much),
> that line seems to work... I made it in case the script is launched
> with say: 'python somedir\someotherdir\script.py' rather than 'python
> script.py', because I need it to work in its own and parent
> directory.
You can determine the directory where the script resides using:
    import os
    basedir = os.path.dirname(os.path.abspath(__file__))
This way it doesn't matter how it was launched. But execute the above code as soon as possible (before any chdir): __file__ may be a relative path, and abspath() resolves it against the current directory.
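As a rough sketch of that idea: the helper names script_dir and resource_path, and the file name checksums.txt, are made up for illustration and are not taken from the original script. The point is only that the directory is captured once, before any chdir, and every other path is built from it.

```python
import os

def script_dir(script_path):
    """Return the absolute directory that contains script_path."""
    # abspath() resolves a relative path against the *current* working
    # directory, so call this before any os.chdir().
    return os.path.dirname(os.path.abspath(script_path))

def resource_path(base, *parts):
    """Build a path below the directory captured at startup."""
    return os.path.join(base, *parts)

# In the real script this would be: basedir = script_dir(__file__)
# A literal relative path stands in for __file__ here.
basedir = script_dir(os.path.join("somedir", "someotherdir", "script.py"))

# "checksums.txt" is a hypothetical file name.
print(resource_path(basedir, "checksums.txt"))
```

With basedir fixed at startup, the functions can open files through resource_path() and most of the remaining chdir calls probably become unnecessary.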
> checksums = open(checksums, 'r')
> for fline in checksums.readlines():
You can iterate directly over the file:
    for fline in checksums:
(readlines() reads the whole file contents into memory; I guess this is not an issue here, but in other cases it may be an important difference.)
Although it's perfectly valid, I would not recommend using the same name for two different things (checksums refers to the file name *and* the file itself).
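Putting both remarks together, a minimal sketch could look like the following. The function name read_checksums and the line format (one "<checksum> <file name>" pair per line) are assumptions, since the original checksum file format is not shown here.

```python
def read_checksums(checksums_path):
    """Return a {file_name: checksum} dict read from checksums_path.

    Assumes one "<checksum> <file name>" pair per line; adjust the
    parsing to whatever format the real checksum file uses.
    """
    expected = {}
    # Distinct names: checksums_path is the file name,
    # checksums_file is the open file object.
    checksums_file = open(checksums_path, 'r')
    try:
        # Iterating over the file yields one line at a time,
        # without loading the whole file into memory.
        for fline in checksums_file:
            fline = fline.strip()
            if not fline:
                continue  # skip blank lines
            checksum, file_name = fline.split(None, 1)
            expected[file_name] = checksum
    finally:
        checksums_file.close()
    return expected
```

Splitting only once (split(None, 1)) keeps file names that contain spaces in one piece.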
> changed_files_keys = changed_files.keys()
> changed_files_keys.sort()
> missing_files.sort()
> print '\n'
> if len(changed_files) != 0:
>     print 'File(s) changed:'
>     for key in changed_files_keys:
You don't have to copy the keys and sort them; use the sorted() builtin:
    for key in sorted(changed_files.iterkeys()):
Also, "if len(changed_files) != 0" is usually written as:
    if changed_files:
The same goes for missing_files.
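For illustration, here is a small sketch that combines sorted() with the truth-value test. The function name report_lines and the 'File(s) missing:' heading are assumptions. It iterates over the dict directly, which yields the keys and works in both Python 2 and Python 3 (iterkeys() exists only in Python 2).

```python
def report_lines(changed_files, missing_files):
    """Build the report as a list of lines instead of printing directly."""
    lines = []
    # An empty dict or list is false, so no len() comparison is needed.
    if changed_files:
        lines.append('File(s) changed:')
        # sorted() returns a new sorted list of the dict's keys.
        for key in sorted(changed_files):
            lines.append('\t%s' % key)
    if missing_files:
        lines.append('File(s) missing:')  # assumed heading
        for name in sorted(missing_files):
            lines.append('\t%s' % name)
    return lines

for line in report_lines({'b.txt': 'x', 'a.txt': 'y'}, ['z.txt']):
    print(line)
```

Since sorted() returns a new list, missing_files itself is left unsorted, unlike with missing_files.sort().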
> for x in range(len(missing_files)):
>     print '\t', missing_files[x]
The construct range(len(somelist)) is very rarely needed. Either you don't need the index, and write:
    for missing_file in missing_files:
        print '\t', missing_file
Or you want the index too, and write:
    for i, missing_file in enumerate(missing_files):
        print '%2d: %s' % (i, missing_file)
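A small variation, in case the listing should count from 1 instead of 0: enumerate() accepts a start value as a second argument (available from Python 2.6 on). The helper name numbered is made up for this sketch.

```python
def numbered(missing_files, start=1):
    """Return the file names as right-aligned, numbered lines."""
    # enumerate() yields (index, item) pairs; no range(len(...)) needed.
    return ['%2d: %s' % (i, name)
            for i, name in enumerate(missing_files, start)]

for line in numbered(['a.txt', 'b.txt']):
    print(line)
```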
> def calculate_checksum(file_name):
>     file_to_check = open(file_name, 'rb')
>     chunk = 8196
Any reason to use such a number? 8K is 8192, not 8196; you could write 8*1024 if you don't remember the value. I usually write 1024*1024 when I want exactly 1M.
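Since the rest of calculate_checksum is not quoted here, the following is only a guess at its general shape, with the chunk size spelled as 8*1024. The hash algorithm (SHA-256 via hashlib) and the CHUNK_SIZE constant are assumptions; substitute whatever the original script actually uses.

```python
import hashlib

CHUNK_SIZE = 8 * 1024  # 8K, i.e. 8192 bytes

def calculate_checksum(file_name, chunk_size=CHUNK_SIZE):
    """Return the hex digest of a file, reading it chunk by chunk."""
    digest = hashlib.sha256()  # assumption: the original algorithm is not shown
    file_to_check = open(file_name, 'rb')
    try:
        while True:
            chunk = file_to_check.read(chunk_size)
            if not chunk:
                break  # an empty read means end of file
            digest.update(chunk)
    finally:
        file_to_check.close()
    return digest.hexdigest()
```

Reading in fixed-size chunks keeps memory use constant regardless of file size, and the result does not depend on the chunk size chosen.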
--
Gabriel Genellina