Thanks comp.lang.python!!!

Andy Jewell andy at wild-flower.co.uk
Sun Jul 20 10:25:16 EDT 2003


On Sunday 20 Jul 2003 4:40 am, hokiegal99 wrote:
> While migrating a lot of Mac OS9 machines to PCs, we encountered several
> problems with Mac file and directory names. Below is a list of the
> problems:
>
> 1. Macs allow characters in file and dir names that are not acceptable
> on PCs. Specifically this set of characters [*<>?\|/]
>
> 2. Mac files and dirs that contained a "/" in their names would ftp to
> the server OK, but the "/" would be translated to "%2f". So, a Mac file
> named 7/19/03 would ftp as 7%2f19%2f03... not a very desirable filename,
> especially when there are hundreds of them.
>
> 3. The last problem was spaces at the beginning and ending of file and
> dir names. We encountered hundreds of files and dirs like this on the
> Macs. They would ftp up to the Linux server OK but when the Windows PC
> attempted to d/l them, the ftp transaction would stop and complain about
> not finding files whenever it tried to transfer a file. Dirs with spaces
> at the beginning or ending would literally crash the ftp client.
>

Shame on Apple for allowing subversive filenames! ;-)

> These were problems that we did not expect. So, we wrote a script to
> clean up these names. Since all of the files were being uploaded to a
> Linux ftp server, we decided to do the cleaning there. Python is a
> simple, easily readable programming language, so we chose to use it.
> Long story short, attached to this email is the script. If anyone can
> use it to address Mac to PC migrations, feel free to.
>

You never do, until they bite you!

> The only caveat is that the script uses os.walk. I don't think Python
> 2.2.x comes with os.walk. To address this, we d/l 2.3b2 and have used it
> extensively with this script w/o any problems. And, someone here on
> comp.lang.python told me that os.walk could be incorporated into 2.2.x
> too, but I never tried to do that as 2.3b2 worked just fine.
>

Python 2.2 has os.path.walk instead, which calls a function you supply once 
for each directory it visits.

> Thanks to everyone who contributed to this script. Much of it is
> straight from advice that I received here. Also, if anyone sees how it
> can be improved, let me know. For now, I'm satisfied with it as it works
> "well enough" for what I need it to do, however, I'm trying to become a
> better programmer so I appreciate feedback from those who are much more
> experienced than I am.
>

You're welcome.  We all come here to learn... :-))

> Special Thanks to Andy Jewell, Bengt Richter and Ethan Mindlace Fremen
> as they wrote much of the code initially and gave a lot of great tips!!!

:-))

Some additional comments on your source-code, if I may.  The following points 
will help you make your program much more efficient:

1) You'd normally place your functions in a separate section, usually at the 
top of your program, rather than 'in the middle'.  It will work fine this 
way, but it's a bit less readable.

2) There seem to be some indentation anomalies, probably because of using a 
combination of tabs and spaces.  This WILL bite you sometime in the future: 
best to stick to one or t'other, preferably just spaces: the convention in 
Python is to indent by 4 spaces for each 'suite',  or logical 'block' of 
code.

3) I'm not sure you quite get the recursive bit yet!  Simply calling your 
function lots of times in succession doesn't cut it... all that happens is 
that each time you call it, it does the same thing, effectively doing the job 
10 times...  What you'd have to do is call the function from *WITHIN* itself, 
i.e. in the body, like:

import os

def recurse(dir,depth=0):
    """ walk dir's subdirectories recursively, printing their names """
    # process list of entries in dir...
    for entry in os.listdir(dir):
        # if the current one is a directory...
        if os.path.isdir(os.path.join(dir,entry)):
            print("    "*depth+"+"+entry)
            # recurse (call ourselves)
            recurse(os.path.join(dir,entry),depth+1)

** NOTE: Looking at the docs, if you use os.walk, you don't need to do the 
recursion yourself, as os.walk does it for you!
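
To illustrate the note above, here is a small sketch of the same directory 
listing done with os.walk instead of hand-written recursion.  The name 
tree_lines is made up for this example, and os.path.relpath needs a newer 
Python than 2.3, so treat it as an illustration rather than something to drop 
into your 2.3b2 setup:

```python
import os

def tree_lines(top):
    """Return one indented '+name' line per subdirectory found under top."""
    lines = []
    for root, dirs, files in os.walk(top):
        dirs.sort()  # sorting in place makes os.walk visit them in this order
        if root == top:
            continue  # the starting directory itself is not listed
        # depth = how many levels below top this directory sits, minus one
        rel = os.path.relpath(root, top)
        depth = rel.count(os.sep)
        lines.append("    " * depth + "+" + os.path.basename(root))
    return lines
```

Printing the result with "\n".join(tree_lines(setpath)) should give the same 
kind of listing as recurse(), without the function ever calling itself.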

4) You're still repeating yourself several times, too.  You can get away with 
JUST ONE os.walk() loop:

# 'bad' is the compiled regex from your script, e.g. re.compile(r'[*<>?\\|/]')
# topdown=False: a directory is renamed only after os.walk has finished inside it
for root, dirs, files in os.walk(setpath, topdown=False):
    for thisfile in dirs+files:
        badchars=bad.findall(thisfile)
        newname=thisfile.strip()  # strip off leading and trailing whitespace
        # replace any bad characters...
        for badchar in badchars:
            newname=newname.replace(badchar,"-")
        # rename thisfile ONLY if newname is different...
        if newname != thisfile:
            print("renaming %s to %s in %s" % (thisfile, newname, root))
            os.rename(os.path.join(root,thisfile),os.path.join(root,newname))

!! that replaces what your four for loops do... 8-0
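
Putting the points above together, here is a minimal sketch of how the whole 
script might be laid out, with the functions at the top and a single walk.  
It is not a drop-in replacement: the names BAD_CHARS, clean_name and 
clean_tree are made up for illustration, the character set is the one quoted 
from your message, and it does not check whether the new name already exists. 
It walks bottom-up (topdown=False) so that a directory is renamed only after 
everything inside it has been dealt with:

```python
import os
import re

# Character set quoted in the original message (assumed to be complete).
BAD_CHARS = re.compile(r'[*<>?\\|/]')

def clean_name(name):
    """Strip surrounding whitespace and replace each bad character with '-'."""
    return BAD_CHARS.sub("-", name.strip())

def clean_tree(top):
    """Rename files and directories under top; return (old, new) path pairs."""
    renamed = []
    # Bottom-up, so a directory is renamed only after its contents.
    for root, dirs, files in os.walk(top, topdown=False):
        for name in dirs + files:
            newname = clean_name(name)
            # Skip unchanged names, and names that would become empty.
            if newname and newname != name:
                old = os.path.join(root, name)
                new = os.path.join(root, newname)
                os.rename(old, new)
                renamed.append((old, new))
    return renamed
```

Calling clean_tree(setpath) once should then cover the whole tree, and the 
returned list can be printed or logged to see what was changed.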

hope you find this useful :-)

-andyj        





