Strategy/ Advice for How to Best Attack this Problem?

Peter Otten __peter__ at web.de
Fri Apr 3 06:45:17 EDT 2015


Saran A wrote:

> I debugged and rewrote everything. Here is the full version. Feel free to
> tear this apart. The homework assignment is not due until tomorrow, so I
> am currently also experimenting with pyinotify as well.

Saran, try to make a realistic assessment of your capability. Your 
"debugged" code still has this gem

>    while True:
>        time.sleep(1) #time between update check

and you are likely to miss your deadline anyway.

While the inotify approach may be the "right way" to attack this problem 
from the point of view of an experienced developer like Chris, as a newbie 
you should stick to the simplest thing that can possibly work.

In a previous post I mentioned unit tests, but even an ad-hoc test in the 
interactive interpreter will show that your file_len() function doesn't work:

> #returns length of file
> def file_len(f):
>     with open(f) as f:
>         for i, l in enumerate(f):
>             pass
>             return i + 1
> 

$ python
Python 2.7.6 (default, Mar 22 2014, 22:59:56) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> #returns length of file
... def file_len(f):
...     with open(f) as f:
...         for i, l in enumerate(f):
...             pass
...             return i + 1
... 
>>> with open("tmp.txt", "w") as f: f.write("abcde")
... 
>>> file_len("tmp.txt")
1

It reports 1 for every non-empty file, because the return statement sits 
inside the loop and is executed on the first iteration.
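
For comparison, here is a sketch of a repaired file_len(), assuming that 
"length" means the number of lines and that you run Python 3. The essential 
change is that the return statement moves out of the loop; starting the 
count at 0 is my own addition so that an empty file yields 0.

```python
def file_len(path):
    """Return the number of lines in the file at path."""
    count = 0  # stays 0 if the file has no lines at all
    with open(path) as f:
        # enumerate(f, 1) numbers the lines 1, 2, 3, ...; after the loop
        # count holds the number of the last line seen.
        for count, _line in enumerate(f, 1):
            pass
    return count
```

Under this reading a file that contains only "abcde" has exactly one line, 
so you need a file with several lines to tell the broken version from the 
repaired one. If the length is meant in characters instead, the function 
would have to return something like len(f.read()).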

A simple unit test would look like this (assuming watchscript.py is the name 
of the script to be tested):

import unittest

from watchscript import file_len

class FileLen(unittest.TestCase):
    def test_five(self):
        LEN = 5
        FILENAME = "tmp.txt"
        with open(FILENAME, "w") as f:
            f.write("*" * LEN)

        self.assertEqual(file_len(FILENAME), 5)
        
if __name__ == "__main__":
    unittest.main()

Typically for every tested function you add a new TestCase subclass and for 
every test you perform for that function you add a test_...() method to that 
class. unittest.main() will collect these methods, run them, and generate 
a report. For the above example it will complain:

$ python test_watchscript.py 
F
======================================================================
FAIL: test_five (__main__.FileLen)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_watchscript.py", line 12, in test_five
    self.assertEqual(file_len(FILENAME), 5)
AssertionError: 1 != 5

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=1)


The file_len() function is but one example of the many problems with your 
code. There is no shortcut: you have to decide for every function what 
exactly it should do and then verify that it actually does what it is meant 
to do. Try to negotiate an extra week or two with your supervisor and then 
start small:

On day one make a script that moves all files from one directory to another.
Make sure that the source and destination directories are stored in only one 
place in your script so that they can easily be replaced. Run the script 
after every change and devise tests that demonstrate that it works 
correctly.
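
A minimal sketch of what day one might look like, assuming Python 3. The 
directory names "incoming" and "processed" and the function name move_all() 
are placeholders of mine, not part of your assignment.

```python
import os
import shutil

# The only place where the two directories are named (hypothetical names).
SOURCE_DIR = "incoming"
DEST_DIR = "processed"


def move_all(source_dir=SOURCE_DIR, dest_dir=DEST_DIR):
    """Move every regular file from source_dir to dest_dir.

    Returns the sorted list of file names that were moved.
    Subdirectories are left alone.
    """
    moved = []
    for name in sorted(os.listdir(source_dir)):
        source_path = os.path.join(source_dir, name)
        if os.path.isfile(source_path):
            shutil.move(source_path, os.path.join(dest_dir, name))
            moved.append(name)
    return moved
```

Because the directories are parameters with defaults, a test can call 
move_all() with two temporary directories and compare their contents before 
and after the call.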

On day two make a script that checks all files in a directory. Put the 
checks into a function that takes a file path as its only parameter.
Write a few correct and a few broken files and test that only the correct 
ones are recognized as correct, and only the broken ones are flagged as 
broken, and that all files in the directory are checked.
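
Day two could be sketched like this, again assuming Python 3. I don't know 
the actual rules a file has to satisfy, so is_valid() below uses a stand-in 
rule (the file must not be empty); both function names are my invention.

```python
import os


def is_valid(path):
    """Placeholder check: a file counts as correct if it is not empty.

    The real rules come from the assignment; this one is only a stand-in.
    """
    return os.path.getsize(path) > 0


def check_directory(directory, check=is_valid):
    """Apply check to every regular file in directory.

    Returns a dict that maps each file name to True (correct)
    or False (broken).
    """
    results = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            results[name] = check(path)
    return results
```

With a few hand-written good and broken files in a test directory the 
returned dict can be compared directly against the expected one, which 
covers all three requirements at once.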

On day three make a script that combines the above and only moves the 
correct files into another directory. Devise tests to verify that both 
directories contain the expected files before and after your script is 
executed.

On day four make a script that also moves the bad files into another 
directory. Modify your tests from the previous day to check the contents of 
all three directories.
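
Here is one way days three and four might fit together, including a test 
that looks at all three directories. It assumes Python 3; sort_files(), the 
directory keys and the "not empty" rule in is_valid() are illustrative 
placeholders.

```python
import os
import shutil
import tempfile
import unittest


def is_valid(path):
    # Placeholder rule (an assumption): non-empty files are correct.
    return os.path.getsize(path) > 0


def sort_files(source_dir, good_dir, bad_dir, check=is_valid):
    """Move correct files to good_dir and all other files to bad_dir."""
    for name in sorted(os.listdir(source_dir)):
        path = os.path.join(source_dir, name)
        if os.path.isfile(path):
            target_dir = good_dir if check(path) else bad_dir
            shutil.move(path, os.path.join(target_dir, name))


class SortFilesTest(unittest.TestCase):
    def setUp(self):
        # Fresh, empty directories for every test.
        self.root = tempfile.mkdtemp()
        self.dirs = {}
        for key in ("source", "good", "bad"):
            self.dirs[key] = os.path.join(self.root, key)
            os.mkdir(self.dirs[key])

    def tearDown(self):
        shutil.rmtree(self.root)

    def listing(self, key):
        return sorted(os.listdir(self.dirs[key]))

    def test_files_end_up_in_the_right_directory(self):
        for name, content in [("ok.txt", "data"), ("empty.txt", "")]:
            with open(os.path.join(self.dirs["source"], name), "w") as f:
                f.write(content)
        # Before: everything is in the source directory.
        self.assertEqual(self.listing("source"), ["empty.txt", "ok.txt"])
        self.assertEqual(self.listing("good"), [])
        self.assertEqual(self.listing("bad"), [])

        sort_files(self.dirs["source"], self.dirs["good"], self.dirs["bad"])

        # After: the source is empty and each file is where it belongs.
        self.assertEqual(self.listing("source"), [])
        self.assertEqual(self.listing("good"), ["ok.txt"])
        self.assertEqual(self.listing("bad"), ["empty.txt"])
```

Saved as a module, this can be run with "python -m unittest" followed by 
the module name.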

On day five wrap your previous efforts in a loop that runs forever.
It seems to work? You are not done. Write tests that demonstrate that it does 
work. Try to think of the corner cases: what if there are no new files on 
one iteration? What if there is a new file with the same name as a previous 
one? Don't handwave; describe the reaction of your script in plain English 
with all the gory details and then translate into Python.
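
To make the corner cases concrete, here is a sketch of a polling loop under 
two assumptions of mine: a pass that finds no new files simply does nothing, 
and a file whose name is already taken in the destination gets a numbered 
name instead of overwriting the old one. Your spec may demand something 
else. All names, and the max_passes parameter that lets a test stop the 
loop, are hypothetical. Python 3 is assumed.

```python
import itertools
import os
import shutil
import time


def unique_destination(dest_dir, name):
    """Return a path in dest_dir that does not exist yet, based on name."""
    candidate = os.path.join(dest_dir, name)
    stem, ext = os.path.splitext(name)
    for n in itertools.count(1):
        if not os.path.exists(candidate):
            return candidate
        # Name is taken: try stem.1.ext, stem.2.ext, ...
        candidate = os.path.join(dest_dir, "{}.{}{}".format(stem, n, ext))


def process_once(source_dir, dest_dir):
    """Move all regular files once; return the names used in dest_dir."""
    moved = []
    for name in sorted(os.listdir(source_dir)):
        source_path = os.path.join(source_dir, name)
        if os.path.isfile(source_path):
            target = unique_destination(dest_dir, name)
            shutil.move(source_path, target)
            moved.append(os.path.basename(target))
    return moved


def watch(source_dir, dest_dir, interval=1.0, max_passes=None):
    """Call process_once() repeatedly, sleeping interval seconds in between.

    With max_passes=None the loop runs forever; tests pass a small number.
    """
    passes = itertools.count() if max_passes is None else range(max_passes)
    for _ in passes:
        process_once(source_dir, dest_dir)
        time.sleep(interval)
```

Keeping the work of a single pass in process_once() means the tests can 
exercise every corner case without ever entering the endless loop.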

Eureka!

Note that a "day" in the above outline can be 15 minutes or a week. You're 
done when you're done. Also note that the amount of code you have written 
is no indication of how close you are to your goal. An experienced 
programmer will end up with less code than you to fulfill the same spec. 
That shouldn't bother you. On the other hand, you should immediately remove 
code that doesn't contribute to your script's goal. If you keep failed 
attempts and side tracks, your code will become harder and harder to 
maintain, and that's the last thing you need when it's already hard to get 
the necessary parts right. 

Good luck!