[Tutor] Introductory questions on test-driven development and implementing Git version control.

Steven D'Aprano steve at pearwood.info
Sat Apr 25 02:36:57 CEST 2015


So many questions... let's hope I don't miss any... :-)

On Fri, Apr 24, 2015 at 02:09:45PM -0500, boB Stepp wrote:

> First question: What testing modules/frameworks should I start out
> with? Doing a quick scan of the books I have, mention is made of
> doctest and unittest modules in the Python standard libraries. But
> mention is also made of two third party modules, nose and pytest.

Doctest is the simplest but the least powerful. And besides, unless you 
are Tim Peters, the creator of doctest, nobody wants to read dozens and 
dozens of tests in a function docstring.

unittest is quite easy to use, and powerful. Some people complain that 
it is not very Pythonic, but I've never felt that. It is also based on a 
standard and much-copied Java library, so there are versions of unittest 
for many different languages. I quite like it.
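For comparison, the same checks written as a unittest test case 
(again, `mean` is a hypothetical function, not anything from the 
standard library):

```python
import unittest

def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)

class TestMean(unittest.TestCase):
    def test_simple_mean(self):
        self.assertEqual(mean([1, 2, 3, 4]), 2.5)

    def test_single_value(self):
        self.assertEqual(mean([10]), 10.0)

if __name__ == "__main__":
    # Runs every test_* method; exit=False keeps the interpreter alive.
    unittest.main(exit=False, argv=["mean-tests"])
```

The xUnit-style class and assert* method names are exactly the 
Java-inherited conventions people call "un-Pythonic"; they are also 
what makes the framework recognisable across languages.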

I've never used nose, but from what I have seen of it, I will not like 
it. I understand that it extensively uses the assert statement for 
testing, which I believe is naughty: it is, I maintain, a misuse of 
assert. It's okay to use a few assert statements for quick and informal 
testing, but not for permanent professional testing. 

If nothing else, by using assert for your tests, you cannot possibly 
test your code when running under -O optimization.
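You can see the problem directly: under -O, Python strips assert 
statements out entirely, so a test suite built on bare asserts silently 
checks nothing. A quick demonstration, assuming only that the same 
interpreter can be re-invoked as a subprocess:

```python
import subprocess
import sys

# A deliberately failing assertion.
code = "assert 1 + 1 == 3, 'arithmetic is broken'"

# Normal run: the assert fires and the process exits with an error.
normal = subprocess.run([sys.executable, "-c", code],
                        capture_output=True)
print(normal.returncode)        # non-zero: AssertionError was raised

# Optimized run: -O removes the assert, so nothing is checked at all.
optimized = subprocess.run([sys.executable, "-O", "-c", code],
                           capture_output=True)
print(optimized.returncode)     # 0: the "test" silently passed
```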

I have no opinion on pytest.


> What
> would be the most productive use of my learning time while entering
> TDD waters for the first time? And beyond automating unit tests, how
> is integration testing done in an automated fashion? Will I require
> special software for this? And then there is user interface testing...

The different sorts of tests are separated by purpose, not necessarily 
form or usage. Particularly for small projects, it may not make sense to 
split them into separate test runs. It's okay to mix regression tests 
into your unit tests, and even integration testing, if they are simple 
and fast enough.

As for automated UI testing, that's hard unless you have a framework 
that is designed for that, something which can interact with the GUI 
controls and verify that they behave as expected. I have no idea about 
that.

In any case, you should be able to run all the tests (be they doc tests, 
unit tests, regression tests, etc.) from a single command. I like to set 
things up so that these will work:

python3 myproject/tests.py

python3 -m myproject.tests

If the integration tests are slow and complex, then you might have two 
test files, one which runs everything, and the other which runs 
everything but the integration tests. Run the quick tests frequently, 
and the slow tests less often.


> And what would be the best approach to integrating Git with these
> efforts? Just how often does one commit one's code to the version
> control system? Or do I have some GCEs (Gross Conceptual Errors) here?
> Can Git be set up to automatically keep track of my code as I create
> and edit it?

No, that's not how revision control works. You really don't want every 
time you hit save to count as a new revision. That would be ugly.

Joel Spolsky has a good introduction to Mercurial (hg). Git is slightly 
different, but the fundamentals are more or less equivalent:

http://hginit.com/
You can also watch Git For Ages 4 And Up:

http://www.youtube.com/watch?v=1ffBJ4sVUb4


The executive summary of how I use version control:

- work on bite-sized chunks of functionality
- when the tests all pass, commit the work done
- push changes to the master repo at least once per day


The way I use version control on my own is that I typically use a 
single branch. I rarely have to worry about contributions from others, 
so it's just my changes. Make sure that all the relevant files (source 
code, documentation, tests, images, etc.) are being tracked. Static 
files which never change, like reference materials, should not be.

Starting from a point where all the tests pass, I decide to work on a 
new feature, or fix a bug. A feature might be something as small as "fix 
the documentation for this function", but *not* as big as "control 
remote controlled space ship" -- in other words, a bite-sized chunk of 
work, not a full meal. I write some tests, and write the minimal amount 
of code that makes those tests pass:

- write tests
- save tests
- write code
- save code
- run tests
- fix bugs in tests
- save tests
- write some more code
- save code
- run tests again
- write some more code
- save code
- run tests again


etc. Once the tests pass, then I have a feature and/or bug fix, and I 
commit all the relevant changes to the VCS. hg tracks files 
automatically; git requires you to stage the changed files every 
time. Either way, by the time I run `hg commit` or `git 
commit` I have a complete, and hopefully working, bite-sized chunk of 
code that has an obvious commit message:

"fix bug in spam function"
"correct spelling errors in module docs"
"rename function ham to spam"
"change function eggs from using a list to a dict"
"move class K into its own submodule"

etc. Notice that each change is small enough to encapsulate in a short 
description, but big enough that some of them may require multiple 
rounds of back-and-forth code-and-test before it works.
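A minimal pass through that loop, with unittest and a made-up spam 
function: the test comes first and fails (spam doesn't exist yet), then 
just enough code is written to turn it green.

```python
import unittest

# Step 1: write the test first. At this point spam() does not exist,
# so running the tests fails -- that failure is the point.
class TestSpam(unittest.TestCase):
    def test_spam_repeats_word(self):
        self.assertEqual(spam(3), "spam spam spam")

# Step 2: write the minimal code that makes the test pass.
def spam(n):
    """Return the word 'spam' repeated n times, space-separated."""
    return " ".join(["spam"] * n)

# Step 3: run the tests; green means commit, red means keep fixing.
if __name__ == "__main__":
    unittest.main(exit=False, argv=["tdd-demo"])
```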

I run the tests even after seemingly innocuous changes to comments or 
docstrings, especially docstrings. Edits to a docstring may break your 
doctests, or even your code, if you accidentally break the quoting.

Then, when I'm satisfied that I've done a sufficiently large amount 
of work, I push those changes to the master repo (if any). 
This allows me to work from various computers and still share the same 
code base. "Sufficiently large" may mean a single change, or a full 
day's work, or a whole lot of related changes that add up to one big 
change, whatever you prefer. But it shouldn't be less than once per day.
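In command form, one bite-sized commit looks something like this. The 
file name and commit message are made up, and the demo deliberately 
works in a throwaway repository (it assumes only that git is on the 
PATH):

```shell
# Demo of one commit cycle in a scratch repository.
demo=$(mktemp -d) && cd "$demo"
git init -q .
git config user.email you@example.com   # needed in a fresh environment
git config user.name "you"

echo 'def spam(n): return " ".join(["spam"] * n)' > spam.py
git add spam.py                          # git must be told what to stage
git commit -q -m "fix bug in spam function"

git log --oneline                        # one commit, one short message
```

In a real project the `git push origin master` to the master repo 
happens on top of this, at least once per day.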


> And as to automated testing: I really, ..., really would like to
> implement it on my side projects at work. But all such programs start
> in a proprietary scripting environment, which can call external Python
> (or other languages) scripts. The resulting final program is almost
> always an unavoidable amount of proprietary scripting language (which
> I always strive to minimize), Solaris shell
> commands/scripts and Python. As I have been learning more Python and
> implementing it at work, I have found that the most successful
> approach seems to be to first get all of the information I need out of
> the CSA (commercial software environment) upfront, save it in temp
> files, then call a Python script to start the heavy duty processing,
> do everything possible in Python, generate a "reload" script file that
> contains language the CSA understands, and finally run that inside the
> CSA. How do I go about automating the testing of something like this?
> And apply TDD write tests first principles?

TDD principles apply to any programming language. So long as the 
language that you use to write the tests has ways of calling your shell 
scripts and CSA code, and seeing what results they get, you can test 
them.

For example, I might write a test which calls a shell script using 
os.system, and checks that the return result is 0 (success). And a 
second test that confirms that the shell script actually generates the 
file that it is supposed to. A third test to confirm that it cleans up 
after itself. Etc.
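Concretely, such a test can shell out with subprocess (os.system works 
too, but subprocess gives cleaner access to the exit status). Everything 
below is a made-up stand-in: the "shell script" is generated on the fly 
so the example is self-contained, and it assumes a POSIX /bin/sh.

```python
import os
import subprocess
import tempfile
import unittest

# Stand-in for the real script: writes a report file and exits 0.
SCRIPT = """#!/bin/sh
echo "report contents" > "$1"
"""

class TestReportScript(unittest.TestCase):
    def setUp(self):
        self.workdir = tempfile.mkdtemp()
        self.script = os.path.join(self.workdir, "make_report.sh")
        with open(self.script, "w") as f:
            f.write(SCRIPT)
        self.report = os.path.join(self.workdir, "report.txt")

    def test_exit_status_is_success(self):
        result = subprocess.run(["/bin/sh", self.script, self.report])
        self.assertEqual(result.returncode, 0)

    def test_report_file_is_created(self):
        subprocess.run(["/bin/sh", self.script, self.report])
        self.assertTrue(os.path.exists(self.report))

if __name__ == "__main__":
    unittest.main(exit=False, argv=["report-tests"])
```

The same shape works for the CSA "reload" files: run the pipeline, then 
assert that the generated file exists and contains what it should.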


> And I would like to have all of that under version control, too. But
> while I am allowed to write my own programs for this CSA, I am not
> allowed to install anything else, strange as this may sound! Since the
> only functional editors in these bare-bones Solaris 10 environments
> are some simplistic default editor that I do not know the name of and
> vi, I long ago gravitated to doing my actual coding on my Windows PC
> (Being careful to save things with Unix line endings.) and FTPing to
> the environments where these programs will actually run. I AM allowed
> to install anything I want (within reason) on my PC. So I am thinking
> install and use Git there?

Yes.


> And if successful automated testing can be done with this CSA
> situation, how difficult is it to backtrack and add test suites to
> stuff already written and being used? Are there special strategies and
> techniques for accomplishing such a feat?

It's easy, especially for doc and unit tests, which is why I personally 
don't care too much about actually writing the tests first. For me, at 
least, it doesn't matter which gets written first.

However, I *religiously* write the documentation first. If I don't 
document what the function does, how will I know what the code should 
do or when I am finished?


-- 
Steve


More information about the Tutor mailing list