Script Optimization

Wed May 7 01:04:58 EDT 2008

On May 4, 10:04 pm, "Gabriel Genellina" <gagsl-... at yahoo.com.ar>
wrote:
> On Sun, 04 May 2008 17:01:15 -0300, lev <levlozh... at gmail.com> wrote:
>
> >> * Change indentation from 8 spaces to 4
> >     I like using tabs because of the text editor I use, the script at
> > the end is with 4 though.
>
> Can't you configure it to use 4 spaces per indent - and not use "hard" tabs?
>
> >> * Remove useless "pass" and "return" lines
> >     I replaced the return nothing lines with passes, but I like
> > keeping them in case the indentation is ever lost - makes it easy to
> > go back to original indentation
>
> I can't think of a case when only indentation "is lost" - if you have a crash or something, normally you lose much more than indentation... Simple backups or an SCM system like CVS or Subversion will help. So I don't see the usefulness of those "pass" statements; I think that after some time using Python you'll consider them just garbage, as everyone else does.
>
> >> * Temporarily change broken "chdir" line
> >     removed as many instances of chdir as possible (a few useless ones
> > to accommodate the functions - changed functions to not chdir as much),
> > that line seems to work... I made it in case the script is launched
> > with say: 'python somedir\someotherdir\script.py' rather than 'python
> > script.py', because I need it to work in its own and parent
> > directory.
>
> You can determine the directory where the script resides using
>
> import os
> basedir = os.path.dirname(os.path.abspath(__file__))
>
> This way it doesn't matter how it was launched. But execute the above code as soon as possible (before any chdir), since __file__ may be a relative path that only resolves correctly from the starting directory.
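A minimal sketch of how that suggestion might replace most chdir calls: resolve the script directory once, then build absolute paths with os.path.join. The helper path_in and the file name "checksums.md5" are hypothetical and used only for illustration; the fallback for a missing __file__ is also an assumption, added so the snippet runs in an interactive session.

```python
import os

# Resolve the script's directory once, before anything calls os.chdir().
# __file__ can be a relative path, so abspath() has to run while the
# launch directory is still the current directory.
try:
    basedir = os.path.dirname(os.path.abspath(__file__))
except NameError:
    # __file__ is undefined in an interactive session; fall back to the cwd.
    basedir = os.getcwd()
parentdir = os.path.dirname(basedir)

def path_in(directory, name):
    """Return the path to `name` inside `directory` (hypothetical helper)."""
    return os.path.join(directory, name)

# "checksums.md5" is a made-up file name, not one taken from the script.
checksum_path = path_in(basedir, "checksums.md5")
```

With absolute paths like checksum_path, the functions can open files in the script's own or parent directory without changing the working directory at all.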
>
> >     checksums = open(checksums, 'r')
> >     for fline in checksums.readlines():
>
> You can directly iterate over the file:
>
>      for fline in checksums:
>
> (readlines() reads the whole file contents into memory; I guess this is not an issue here, but in other cases it may be an important difference)
> Although it's perfectly valid, I would not recommend using the same name for two different things (checksums refers to the file name *and* the file itself)
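Both points could be combined roughly as follows. This is only a sketch: the function name read_checksums and the assumed line layout ("<checksum> <file name>", separated by whitespace) are illustrative guesses, since the format of the checksum file is not shown in this thread. It also uses the with statement, which needs Python 2.6 or later.

```python
def read_checksums(checksums_path):
    """Return a {file_name: checksum} dict read from checksums_path.

    Assumes each non-blank line looks like "<checksum> <file name>";
    that layout is an assumption, not taken from the original script.
    """
    expected = {}
    # Distinct names: checksums_path is the file name, checksums_file the file.
    with open(checksums_path, 'r') as checksums_file:
        # Iterating over the file object reads one line at a time.
        for fline in checksums_file:
            fline = fline.strip()
            if not fline:
                continue  # skip blank lines
            checksum, file_name = fline.split(None, 1)
            expected[file_name] = checksum
    return expected
```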
>
> >     changed_files_keys = changed_files.keys()
> >     changed_files_keys.sort()
> >     missing_files.sort()
> >     print '\n'
> >     if len(changed_files) != 0:
> >         print 'File(s) changed:'
> >         for key in changed_files_keys:
>
> You don't have to copy the keys and sort; use the sorted() builtin:
>
>      for key in sorted(changed_files.iterkeys()):
>
> Also, "if len(changed_files) != 0" is usually written as:
>
>      if changed_files:
>
> The same for missing_files.
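Put together, the reporting part might look something like the sketch below. It returns the lines instead of printing them so that it runs unchanged on Python 2 and 3, and it uses sorted(changed_files) because iterkeys() exists only in Python 2. The function name format_report, the 'File(s) missing:' heading, and the choice to list only the keys are assumptions made for the example.

```python
def format_report(changed_files, missing_files):
    """Build the report lines for changed and missing files."""
    lines = []
    # An empty dict or list is falsy, so no len(...) != 0 test is needed.
    if changed_files:
        lines.append('File(s) changed:')
        # Iterating a dict yields its keys; sorted() returns a new sorted list.
        for key in sorted(changed_files):
            lines.append('\t' + key)
    if missing_files:
        lines.append('File(s) missing:')  # heading text is an assumption
        for missing_file in sorted(missing_files):
            lines.append('\t' + missing_file)
    return lines
```

Unlike missing_files.sort(), sorted(missing_files) leaves the original list untouched, which may or may not matter for the rest of the script.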
>
> >         for x in range(len(missing_files)):
> >             print '\t', missing_files[x]
>
> That construct range(len(somelist)) is very rarely used. Either you don't need the index, and write:
>
> for missing_file in missing_files:
>      print '\t', missing_file
>
> Or you want the index too, and write:
>
> for i, missing_file in enumerate(missing_files):
>      print '%2d: %s' % (i, missing_file)
>
> > def calculate_checksum(file_name):
> >     file_to_check = open(file_name, 'rb')
> >     chunk = 8196
>
> Any reason to use such a number? 8K is 8192; you could use 8*1024 if you don't remember the value. I usually write 1024*1024 when I want exactly 1M.
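For reference, a chunked checksum with the corrected size might be sketched like this. The thread does not say which hash the script uses, so MD5 via hashlib is purely an assumption here, as are the CHUNK_SIZE constant and the algorithm parameter.

```python
import hashlib

CHUNK_SIZE = 8 * 1024  # 8 KiB, i.e. 8192 bytes

def calculate_checksum(file_name, algorithm='md5'):
    """Hash a file in fixed-size chunks so it never has to fit in memory.

    The default algorithm is an assumption, not taken from the script.
    """
    digest = hashlib.new(algorithm)
    with open(file_name, 'rb') as file_to_check:
        while True:
            block = file_to_check.read(CHUNK_SIZE)
            if not block:  # an empty read means end of file
                break
            digest.update(block)
    return digest.hexdigest()
```

The exact chunk size does not change the resulting checksum; it only affects how many read calls are made.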
>
> --
> Gabriel Genellina

Thank you, Gabriel. I did not know about a number of the commands you
posted, and the use of 8196 was an error on my part. I will change the script
to reflect your corrections later tonight. I have another project I
need to finish, comment, and submit for corrections later on, so I will be
using the version of the script that I come up with tonight.

Thank you for your invaluable advice.
The Python community is the first online community that I have had
this much help from. Thank you all.