[Edu-sig] Checking an assumption

Fri Dec 5 10:36:11 EST 2003

In a message of 04 Dec 2003 20:50:52 PST, "Josh English" writes:
>I'm a lurker on this list. I joined because I was on track to become a
>high school level mathematics teacher and I use Python frequently. I am
>currently a graduate student and I'm working on a final project for the
>term. I am creating a rough draft of a curriculum that would teach
>Statistics and Python in the same class. I'm basing this off of two
>assumptions:
>1) Students in a statistics class do not need the practice of repetitive 
>operations.
>2) The most effective way to learn something is to teach it, and
>programming is functionally equivalent to teaching a subject.
>
>I think that these assumptions are good enough to start with, but I'd
>like to hear some other opinions about them. Is programming in any
>language the functional equivalent to teaching a procedural method?
>
>Thanks for any advice,
>Josh English
>english at spiritone.com

Your assumptions are not universally true -- though they certainly are
in many cases.  So you need to check to make sure that they hold true
for your case.  I have a few questions.  First, what country are you
in?  Second, what is the age of the intended students?  Third, how
sophisticated is the statistics you wish to teach?  I am going to
outline here under what conditions I think that what you intend to do
may be a mistake, and cause problems for you.  This isn't to be taken
as a rejection of your plan -- I think that it sounds like a lot of
fun, and will be an excellent way to teach the correct bunch of
students, which I hope you have.  Because in that case, it will be
great.

Assumption one sets you clearly on the path of _education_ rather than
_training_.  (Training is pretty much all about repetitive action.)
For _educating_ to work, you need students who are able to memorize new
chunks of abstract knowledge, and then know exactly where to store and
apply it in the knowledge structures which they have in their own
brains.  This will only happen if your students already have a nice
collection of mathematical knowledge, and enough mathematical
intuition to know how and where to add new stuff.

If they don't, then either a) they won't be able to memorize the
abstract knowledge ('It's all words, and it doesn't really mean
anything') or b) if they are good memorisers, they will know the
definitions word perfect and still not have a clue what they mean.  For
this reason, when I am busy teaching elementary school children the
difference between a mean, a median, and a mode I make them draw some
stick figures and then measure legs and arms and torsos and the lot.
This is because 'means' and 'medians' are not about _words_ and not
about _numbers_, but about _measuring_,  _counting_ and
_populations_.  The students I get understand counting, but not
measuring yet, and have drastic problems understanding populations.
If you don't train them how to measure, you end up with a class of
parrots who can manipulate the numbers to get the correct answer when
you ask them 'calculate the mean of this series of numbers' but who
cannot figure out if their allowance is in line with what is usual for
their peers (if there is only one person in the class that gets more
allowance than they do, it is still _unfair_, because _she gets
more_), or even whether something is 'more common' or not -- if you
hand them a series of 30 numbers from one to ten with '8' repeated 8
times in a row and '9' occurring 10 times, but never in a row, about
half the class will insist that '8' is more common than '9' _even
after you count them_.  It is only after you hand people the actual
numbers on tiny pieces of paper (or paper cutouts) and actually make
piles of 8s and 9s that they can see which is 'more common'.  (And
even then you will get some holdouts, who think that '8' is more
common, because however you slice your series, you can never get a
subset which has more 9s in it than the other numbers, and there are
lots of ways to slice it where you get more 8s than the other numbers.)
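
The difference between counting and looking at runs can be made
concrete with a small sketch.  The series below is made up for
illustration (it is not classroom data): 30 numbers from one to ten,
with 8 appearing eight times in one run and 9 appearing ten times but
never twice in a row.  The helper name `longest_run` is hypothetical.

```python
from collections import Counter
from itertools import groupby

# Made-up series: 30 numbers, one run of eight 8s, ten scattered 9s.
series = [9, 1, 9, 2, 9, 3, 9, 4, 9, 5,
          8, 8, 8, 8, 8, 8, 8, 8,
          9, 6, 9, 7, 9, 10, 9, 1, 9, 2, 3, 4]

def longest_run(values, target):
    """Length of the longest unbroken run of `target` in `values`."""
    runs = [len(list(group)) for value, group in groupby(values)
            if value == target]
    return max(runs, default=0)

counts = Counter(series)            # the "piles of paper" view
print(counts[8], counts[9])         # how many of each there are
print(longest_run(series, 8), longest_run(series, 9))  # what the eye notices
```

Counting gives 8 eights against 10 nines, while the longest runs are
8 and 1, which is presumably what the holdouts are looking at.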

I assume that your students will be older than this.  Indeed, I assume
that they either a) already know how to program, or b) the statistics
that you wish to teach them are the sort that you can do in a trivial
amount of Python code -- the sort of thing that you can just type into
a Python interpreter.  If you don't have this sort of match, then
what you will end up doing is teaching your students how to program.
Now a course 'teaching programming using basic statistical methods'
might be a lot of fun -- though my experience is that 'teaching
programming using problems that involve words' works better -- but if
the stated aim of the course is to teach statistics, you may get in
trouble with your department.
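
As a rough idea of what 'a trivial amount of Python code' could look
like, here is a sketch that assumes a current Python 3, whose standard
library has a `statistics` module.  The leg lengths are invented
numbers, not real measurements.

```python
import statistics

# Invented measurements: leg lengths of five stick figures, in centimetres.
leg_lengths_cm = [62, 65, 65, 70, 58]

mean_leg = statistics.mean(leg_lengths_cm)      # sum divided by count
median_leg = statistics.median(leg_lengths_cm)  # middle value once sorted
mode_leg = statistics.mode(leg_lengths_cm)      # most frequent value

print(mean_leg, median_leg, mode_leg)
```

Anything much longer than this per topic is a hint that the class is
drifting toward a programming course.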

You also need to check to see if they are supposed to be learning
mathematical fundamentals, or whether symbolic manipulation will be
enough.  Something to be concerned about is how this course is
supposed to fit into the general pattern of their mathematical
education.  Some places use statistics courses in order to teach
students how to make graphs with pen and pencil.  The actual
statistical knowledge imparted is relatively minor.  What is important
is to get the ability to visualise data.  And to learn that, you have
to draw it.  Learning how to read graphs is not enough, and learning
how to write a computer program that draws graphs is also not enough.
You actually have to make a bunch of drawings, again and again and
again before, given a series, or an equation, you can actually have a
geometrical sense of what is going on.  (And some people never get it,
no matter what you do.  It is quite frustrating.)  Check and make sure
that this is not what really is supposed to be taught in that class
before you replace the pen and pencil with Python, because in that
case your assumption 1 is shot, and you are supposed to be training
them.

The next thing to check is whether the intellectual effort required to
make the program is spent on learning the formulas you would like the
students to learn (assuming this is what you are up to).  It is quite
easy to get a mismatch.  For instance, a smart student might build a
really nice program for making very pretty bar charts, working and
polishing on this for days, weeks, months.  The actual statistical
equations that are used to graph are almost an afterthought -- you
crack open your textbook, and convert the formula into Python in a
straightforward mechanical way, and that takes you all of 20 minutes.
You can now do a whole family of equations, at roughly 20 minutes an
equation.  Oops.  That student isn't learning statistics, as in the
formulas; he or she is learning how to quickly rewrite formulas in
Python.  (Which is what I do all the time, and is the reason why I
always have to look them up.  I never actually _know_ them.)  This is
a really useful and valuable skill, but probably not what you are
trying to teach.
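
To illustrate how mechanical that conversion can be, here is a sketch
of the textbook sample standard deviation,
s = sqrt(sum((x_i - xbar)^2) / (n - 1)), written out term by term.
The function name `sample_stdev` and the data are made up, and a
current Python 3 standard library is assumed for the cross-check
against `statistics.stdev`.

```python
import math
import statistics

def sample_stdev(xs):
    """Sample standard deviation, transcribed directly from the formula."""
    n = len(xs)
    xbar = sum(xs) / n                                  # the mean
    squared_deviations = [(x - xbar) ** 2 for x in xs]  # (x_i - xbar)^2
    return math.sqrt(sum(squared_deviations) / (n - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]    # made-up numbers
print(sample_stdev(data))
print(statistics.stdev(data))       # library version, for comparison
```

Nothing in writing this requires knowing why the divisor is n - 1
rather than n, which is exactly the gap described above.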

This problem can work in the other direction as well.  For instance,
you will have to give them a nice lecture on 'What is Floating Point,
When can you use it, and when must you never use it.'  I'd dearly
love that lesson taught to high school students, world wide.  But
it is tough going, and my gut feeling is that it will not be easy
to teach this one until your students have an extremely good sense
of what experimental error is.  If they don't have that background,
then they may confuse floating point error with experimental error,
which is a particularly nasty misconception to uproot later.
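
A minimal sketch of the kind of surprise that lecture would have to
cover, using only the standard library (`math.fsum` and
`fractions.Fraction`).  The choice of 0.1 as the example value is just
for illustration.

```python
import math
from fractions import Fraction

tenths = [0.1] * 10

print(sum(tenths) == 1.0)         # False: rounding error accumulates
print(math.fsum(tenths) == 1.0)   # True: fsum tracks the lost low-order bits
print(0.1 + 0.2 == 0.3)           # False: none of these is stored exactly

# Exact rational arithmetic has no rounding at all.
exact_tenths = [Fraction(1, 10)] * 10
print(sum(exact_tenths) == 1)     # True

# Repeating the float sum gives the identical result every time.
print(sum(tenths) == sum(tenths))  # True
```

The last line is the point of contrast: floating point error is
deterministic and repeatable, whereas experimental error varies from
one measurement to the next.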

Also, when building a computer program to do some sort of problem in
numerical analysis, you may find that the bulk of the program is devoted
to handling corner cases and particularly nasty sorts of data.
Writing this code -- at least the way I do it -- is a matter of
writing a bunch of unit tests and building the code so that it handles
all the perverse cases.  But first, your students may be too naive to
see the perverse cases -- this is where they are being first exposed
to them, after all -- and second, in a basic statistics class, you may
want the students to focus on how things go when your data is
well-behaved, and doesn't provide any problem for the programmer.  So
there the goals of test-driven design and learning the statistical
methods may work at cross-purposes.
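
To give a sense of the proportions, here is a sketch of a mean
function with its unit tests, using the standard `unittest` module.
The function and the particular corner cases chosen are illustrative
assumptions, not a complete list of what a real numerical routine
would need.

```python
import unittest

def mean(values):
    """Arithmetic mean; refuses empty input instead of dividing by zero."""
    values = list(values)
    if not values:
        raise ValueError("mean of no data is undefined")
    return sum(values) / len(values)

class MeanCornerCases(unittest.TestCase):
    def test_well_behaved_data(self):
        self.assertEqual(mean([2, 4, 6]), 4)

    def test_single_value(self):
        self.assertEqual(mean([7]), 7)

    def test_empty_input(self):
        with self.assertRaises(ValueError):
            mean([])

    def test_generator_input(self):
        # A generator can only be consumed once; mean() copies it first.
        self.assertEqual(mean(x for x in [1, 2, 3]), 2)
```

Only the first test is about well-behaved data; the other three are
about perverse input, and the statistics itself fits on one line.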

But, provided these problems do not raise their ugly heads, it looks
like a lot of fun, and could be the sort of class where you actually
get a taste of 'what life has to offer' rather than 'what school has
to offer'.  Sounds good to me.

We're having an Education Track at EuroPython June 7-9 2004 in Göteborg,
Sweden.  (And if you are not in Europe, travelling to Europe is
probably cheaper than you think.)  Come give a talk and let us know
how things are going.  That goes for the rest of you, as well.  The
education track chairman is Steve Alexander <steve at z3u.com>, but we
are discussing such things on the EuroPython mailing list now --
http://mail.python.org/mailman/listinfo/europython

Laura  Creighton