using functions and file renaming problem

Sat Jul 19 09:36:41 EDT 2003

My scripts aren't long and complex, so I don't really *need* to use 
functions. But the idea of using them is appealing to me because it 
seems the right thing to do from a design point of view. I can see how 
larger, more complex programs would get out of hand if the programmer 
did not use functions so they'd be absolutely necessary there. But if 
they allow larger programs to have a better overall design that's more 
compact and readable (like your examples showed) then one could argue 
that they would do the same for smaller, simpler programs too.

Thanks for the in-depth explanation. It was very helpful. I'm going to 
try using functions within my fix_files.py script.

Andy Jewell wrote:
> On Friday 18 Jul 2003 11:16 pm, hokiegal99 wrote:
> 
>>Thanks again for the help Andy! One last question: What is the advantage
>>of placing code in a function? I don't see how having this bit of code
>>in a function improves it any. Could someone explain this?
>>
>>Thanks!
> 
> 8<--- (old quotes)
> 
> The 'benefit' of functions is only really reaped when you have a specific need 
> for them!  You don't *have* to use them if you don't *need* to (but they can 
> still improve the readability of your code).  
> 
> Consider the following contrived example:
> 
> ----------8<------------
> # somewhere in the dark recesses of a large project...
> . . . 
> for filename in os.listdir(cfg.userdir):
>     newname = filename
>     for ch in cfg.badchars:
>         # str.replace returns a new string, so keep the result
>         newname = newname.replace(ch, "-")
>     if newname != filename:
>         os.rename(os.path.join(cfg.userdir, filename),
>                   os.path.join(cfg.userdir, newname))
> . . .
> . . .
> # in another dark corner...
> 
> . . . 
> for filename in os.listdir(cfg.tempdir):
>     newname = filename
>     for ch in cfg.badchars:
>         # str.replace returns a new string, so keep the result
>         newname = newname.replace(ch, "-")
>     if newname != filename:
>         os.rename(os.path.join(cfg.tempdir, filename),
>                   os.path.join(cfg.tempdir, newname))
> . . .
> # somewhere else...
> 
> . . . 
> for filename in os.listdir(cfg.extradir):
>     newname = filename
>     for ch in cfg.badchars:
>         # str.replace returns a new string, so keep the result
>         newname = newname.replace(ch, "-")
>     if newname != filename:
>         os.rename(os.path.join(cfg.extradir, filename),
>                   os.path.join(cfg.extradir, newname))
> . . .
> ----------8<------------
> 
> See the repetition? ;-)
> 
> Imagine a situation where you need to do something far more complicated over, 
> and over again...  It's not very programmer efficient, and it makes the code 
> longer, too - thus costing more to write (time) and more to store (disks).
> 
> Imagine having to change the behaviour of this 'hard-coded' routine, and what 
> would happen if you missed one... however, if it is in a function, you only 
> have *one* place to change it.
> 
> When we generalise the algorithm and put it into a function we can do:
> 
> ----------8<------------
> 
> . . .
> . . .
> 
> # somewhere near the top of the project code...
> def cleanup_filenames(dir):
> 
>     """ Renames any files within dir that contain bad characters 
>         (i.e. ones in cfg.badchars).  Does not walk the directory tree.
>     """
> 
>     for filename in os.listdir(dir):
>         newname = filename
>         for ch in cfg.badchars:
>             # str.replace returns a new string, so keep the result
>             newname = newname.replace(ch, "-")
>         if newname != filename:
>             # use the dir argument, not a hard-coded directory
>             os.rename(os.path.join(dir, filename),
>                       os.path.join(dir, newname))
> 
> . . .
> . . .
> 
> # somewhere in the dark recesses of a large project...
> . . . 
> cleanup_filenames(cfg.userdir)
> . . .
> . . .
> # in another dark corner...
> . . . 
> cleanup_filenames(cfg.tempdir)
> . . .
> # somewhere else...
> . . . 
> cleanup_filenames(cfg.extradir)
> . . .
> 
> ----------8<------------
> 
> Even in this small, contrived example, we've saved about 13 lines of code (ok, 
> that's notwithstanding the blank lines and  the """ docstring """ at the top 
> of the function).
> 
> There's another twist, too.  In the docstring for cleanup_filenames it says 
> "Does not walk the directory tree." because we didn't code it to deal with 
> subdirectories.  But we could, without using os.walk...
> 
> Directories form a tree structure, and the easiest way to process trees is by 
> using /recursion/, which means functions that call themselves.  An old 
> programmer's joke is this:
> 
>     Recursion, defn.  [if not understood] see Recursion.
> 
> Each time you call a function, it gets a brand new environment, called the 
> 'local scope'.  All variables inside this scope are private; they may have 
> the same names, but they refer to different objects.  This can be really 
> handy...
> 
> ----------8<------------
> 
> def cleanup_filenames(dir):
> 
>     """ Renames any files within dir that contain bad characters 
>         (i.e. ones in cfg.badchars).  Walks the directory tree to process
>         subdirectories.
>     """
> 
>     for filename in os.listdir(dir):
>         newname = filename
>         for ch in cfg.badchars:
>             # str.replace returns a new string, so keep the result
>             newname = newname.replace(ch, "-")
>         if newname != filename:
>             os.rename(os.path.join(dir, filename),
>                       os.path.join(dir, newname))
>         # recurse if subdirectory...
>         if os.path.isdir(os.path.join(dir, newname)):
>             cleanup_filenames(os.path.join(dir, newname))
> 
> ----------8<------------
> 
> This version *DOES* deal with subdirectories...  with only two extra lines, 
> too!  Trying to write this without recursion would be a nightmare (even in 
> Python).
> 
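
To illustrate the point about local scope made above, here is a minimal
sketch. It uses a nested dict as a made-up stand-in for a directory tree;
the describe function and the sample data are illustrative only and are not
part of the project being discussed. Each call gets its own lines, depth and
name, so the inner calls never disturb the loop in the outer call.

```python
def describe(tree, depth=0):
    """Return indented lines for a nested dict standing in for a directory tree."""
    lines = []                          # a fresh list in every call
    for name in sorted(tree):
        child = tree[name]
        lines.append("  " * depth + name)
        if isinstance(child, dict):     # a "subdirectory": recurse into it
            lines.extend(describe(child, depth + 1))
    return lines


# Hypothetical sample data: dicts are directories, None marks a file.
tree = {"docs": {"a.txt": None, "old": {"b.txt": None}}, "readme": None}
listing = describe(tree)
print("\n".join(listing))
```

The outer call still sees its own name and depth after each recursive call
returns, which is why "readme" comes out unindented at the end.
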
> A very important thing to note, however, is that there is a LIMIT on how 
> deeply function calls can nest, called the recursion limit (1000 by default; 
> see sys.getrecursionlimit() and sys.setrecursionlimit()):
> 
> ----------8<------------
> 
> >>> n = 1
> >>> def rec():
> ...     global n
> ...     n = n + 1
> ...     rec()
> ...
> >>> rec()
> . . .
> (huge traceback list)
> . . .
> RuntimeError: maximum recursion depth exceeded
> >>> n
> 991
> ----------8<------------
> 
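
The same experiment can be run as a script. This is only a sketch, assuming
Python 3.5 or later, where the error is raised as RecursionError (a subclass
of RuntimeError). The names calls and limit are made up for the example. The
exact count depends on how many calls are already on the stack when rec()
starts, so no particular number is promised.

```python
import sys

calls = [0]              # a list, so rec() can update it without 'global'


def rec():
    calls[0] += 1
    rec()                # no stopping condition, on purpose


try:
    rec()
except RecursionError:   # Python 3.5+; older versions raise RuntimeError
    pass

limit = sys.getrecursionlimit()
print(calls[0], "calls were made; the recursion limit is", limit)
```
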
> Another very important thing about recursion is that a recursive function 
> should *ALWAYS* have a 'get-out-clause', a condition that stops the 
> recursion.  Guess what happens if you don't have one ... ;-) 
> 
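
A minimal sketch of a get-out clause, using a made-up sum_to function rather
than anything from the file-renaming code. The first test inside the function
is the condition that stops the recursion, and every recursive call moves one
step closer to it.

```python
def sum_to(n):
    """Add up the integers from 1 to n recursively."""
    if n <= 0:                  # get-out clause: nothing left to add
        return 0
    return n + sum_to(n - 1)    # each call works on a smaller problem


print(sum_to(10))  # 55
```
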
> Finally (at least for now), functions also provide a way to break down your 
> code into logical sections.  Many programmers will write the higher level 
> functions first, delegating 'complicated bits' to further sub-functions as 
> they go, and worry about implementing them once they've got the overall 
> algorithm finished.  This allows one to concentrate on the right level of 
> detail, rather than getting bogged down in the finer points: you just make up 
> names for functions that you're *going* to implement later.  Sometimes, you 
> might make a 'stub' like:
> 
> def doofer(dooby, doo):
>     pass
> 
> so that your program is /syntactically/ correct, and will run (to a certain 
> degree).  This allows debugging to proceed before you have written 
> everything.  You'd do this for functions which aren't *essential* to the 
> program, but maybe add 'special features', for example, additional 
> error-checking or output formatting.
> 
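
As a rough sketch of that top-down style: the overall algorithm in main() is
written first, and two of the helpers are stubs. All the names here
(load_names, tidy, log_change, main) and the canned file names are invented
for the example.

```python
def load_names(directory):
    """Stub: the real version would list the directory."""
    return ["report 1.txt", "notes.txt"]    # canned data so the rest can run


def tidy(name):
    """Replace spaces with dashes."""
    return name.replace(" ", "-")


def log_change(old, new):
    pass    # stub: an optional feature, to be written later


def main(directory):
    """Return (old, new) pairs for every name that needs changing."""
    changed = []
    for name in load_names(directory):
        new = tidy(name)
        if new != name:
            log_change(name, new)
            changed.append((name, new))
    return changed


print(main("anywhere"))
```

The program already runs and can be debugged, even though nothing touches the
disk yet and logging does nothing.
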
> A sort of extension of the function  idea is 'modules', which make functions 
> and other objects available to other 'client' programs.  When you say:
> 
> import os
> 
> you are effectively making all the functions and objects of the os module 
> available to your own program (as os.listdir, os.rename and so on), without 
> having to re-write them.  This enables programmers
> to share their functions and other code as convenient 'black boxes'.  Modules, 
> however, are a slightly more advanced topic.
> 
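
A small sketch of what the import gives you, with made-up file names. The
functions stay inside the module and are reached through its name.

```python
import os

# Build a path and take it apart again using functions from os.path.
path = os.path.join("reports", "summary.txt")
folder, filename = os.path.split(path)

print(folder)
print(filename)
```
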
> 
> Hope that helps.
> 
> -andyj
> 
> 
>