automated comparison tool

Steve D'Aprano steve+python at pearwood.info
Tue Sep 20 20:48:06 EDT 2016


On Wed, 21 Sep 2016 07:20 am, Andrew Clark wrote:

> I've restarted my code so many times i no longer have a working version of
> anything.


*Restarting* your code doesn't change it and cannot possibly stop it from
working. My guess is that you actually mean that you've made so many random
edits to your code, without understanding or thinking about what you are
doing, that you've broken it irreversibly.

I think you've learned a few things:

(1) Debugging by making random perturbations to code should be a last
resort, and even then, only done with the greatest of care.

(2) Use version control so you can roll back changes.

(3) Or at least make a backup copy of a known working version of your code.

(4) Most importantly, never make so many changes to your ONLY copy of the
code in one go that you can break it so badly that you can't go back.
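
If setting up version control feels like too much right now, even a
timestamped copy is better than nothing. Here is a minimal sketch of point
(3); the function name backup_copy and the ".bak" naming scheme are my own
invention, so adjust them to taste:

```python
import shutil
import time
from pathlib import Path


def backup_copy(path):
    """Copy `path` to a timestamped sibling file and return the new path.

    Hypothetical helper: the naming scheme (name.YYYYMMDD-HHMMSS.bak) is
    only an illustration, not a convention from any tool.
    """
    source = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = source.with_name(source.name + "." + stamp + ".bak")
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    return target
```

Call it on your script before you start a round of edits, and you always
have something to go back to.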


You've described your problem quite well, and nicely broken it up into
pieces. I suggest you take each piece, and try to write the code for it
independently of the others. Start with the first piece:

    "access remote directories"


Okay, how are they accessible? Are they just network drives? Then that's
simple: you can access them as if they were local directories. What's the
path name of the network drive(s)? Simply use that. Problem solved.
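
For the network-drive case, something along these lines might be all you
need. This is only a sketch: the function name list_config_files is my own,
and I'm assuming you only care about regular files sitting directly in the
directory, not in subdirectories.

```python
import os


def list_config_files(directory):
    """Return the sorted names of the regular files directly in `directory`.

    Assumption: subdirectories are not of interest and are skipped.
    """
    names = []
    for name in os.listdir(directory):
        # os.listdir returns bare names, so join them back onto the
        # directory before asking whether each one is a regular file.
        if os.path.isfile(os.path.join(directory, name)):
            names.append(name)
    return sorted(names)
```

Pass it whatever path your network drive is mounted at.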

If not, then you need to decide how to access them: over SSH, FTP,
sneaker-net, whatever. The choice you make here is going to impact the
complexity of your code. Think about it carefully.

Once you have working code that can list the remote directory, you can use
it to list your three sets of files:


startupfiles = listdir(...StartupConfig)
runningfiles = listdir(...RunningConfig)
archivefiles = listdir(...ArchiveConfig)


Now you can move on to step 2:

    "run through startup, running and archive to find files 
     with same hostname(do this for all files)"


Now you can forget all about remote file systems (at least for now) and just
work with the three lists of file names. How do you decide which files
belong to which hostname? I don't know, because you don't say. Is the
hostname in the file name? Or do you have to open the file and read the
information from the file contents?

However you do it, start by generating a mapping of hostname: list of file
names.


mapping = {}
for filename in list_of_filenames:
    hostname = get_hostname(filename)
    if hostname in mapping:
        mapping[hostname].append(filename)
    else:
        mapping[hostname] = [filename]  # start a new list with one entry
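
Here is the same idea as a self-contained sketch, using
collections.defaultdict so the "is the key already there?" test goes away.
The get_hostname below is purely a guess on my part: it ASSUMES the hostname
is the part of the file name before the first dot. If your hostnames live
inside the file contents, replace it with something that opens and reads the
file.

```python
import os
from collections import defaultdict


def get_hostname(filename):
    # ASSUMPTION (not from the original problem): the hostname is the part
    # of the base file name before the first dot.
    return os.path.basename(filename).split(".", 1)[0]


def group_by_hostname(filenames):
    """Return a dict mapping each hostname to the list of its file names."""
    mapping = defaultdict(list)
    for filename in filenames:
        # A missing key is created automatically with an empty list.
        mapping[get_hostname(filename)].append(filename)
    return dict(mapping)
```

Under that assumption, files named "router1.cfg" and "router1.bak" would both
end up in the list for "router1".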



Once you've done that for all three directories, then you can collect all
the file names for each host from all three directories, and compare the
files.
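
As for the comparison itself, the standard library's filecmp module is one
possible starting point. This is a rough sketch only: I'm assuming that
"compare" means "are the contents byte-for-byte identical?", and the function
name and the labels ("startup", "running", "archive") are just placeholders.

```python
import filecmp
from itertools import combinations


def differing_pairs(paths_by_label):
    """Return the pairs of labels whose files differ in content.

    `paths_by_label` maps a label such as "startup" to a file path.
    Hypothetical helper, assuming a byte-for-byte comparison is wanted.
    """
    differences = []
    items = sorted(paths_by_label.items())
    for (label_a, path_a), (label_b, path_b) in combinations(items, 2):
        # shallow=False compares the file contents, not just the
        # os.stat() signatures (type, size, modification time).
        if not filecmp.cmp(path_a, path_b, shallow=False):
            differences.append((label_a, label_b))
    return differences
```

An empty list back means all the files for that host agree. If you need to
see *what* changed rather than just *whether* something changed, have a look
at the difflib module instead.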


I leave the rest of the exercise to you, but remember the principle of
Divide And Conquer. Divide the problem into smaller problems which are easy
to solve, solve each small problem, and the big problem solves itself.



-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.



