[Tutor] Writing the right code rite

Avi Gross avigross at verizon.net
Mon Nov 26 19:18:46 EST 2018


I am here to learn but also to help out when I already know something that
is asked.

 

What has been of some concern to me is how to reply when the asker has not
shared with us lots of info.

 

If I see them using "print" without parentheses, I can assume they are not
using 3.x and perhaps emulate their usage. Showing them samples of code that
won't work in their environment is a bit of a waste of time, theirs and
mine.

 

But in watching, I conclude that a subset of the requests come from people
who are some form of early students and only aware of selected aspects of
the language. For some of them the use of words and phrases is a tad loose,
so when they say "list" they might mean anything from what others might call
tuples or lists or dictionaries or sets to, reaching a bit beyond, other
words like vectors, arrays, matrices, bags, collections, name spaces and so
on.

 

What many want is for someone to help them write some kind of simple loop.
Ideally, they would state the problem clearly enough and try to write code
and ask for help when it does not seem to work. If they answer a few questions
to clarify, they can often get a simple answer they can use or understand.
But it might help to steer them otherwise.

 

We have had recent requests, for example, for what sounded like a need to
read text into words and then keep adding them to a "list" if not already
there.

 

What kind of answers have people provided?

 

What many of the answers would have in common is to break the lines up into
tokens we consider words. But even that can be a complex task that can be
done many ways depending on what the file might contain and whether words
can include an apostrophe or hyphen. Some solutions might tokenize a line at
a time and some do the entire file at once using some regular expression
that matches anything after one or more non-word characters till the next
series of one or more non-word characters. Others might just split a line on
a single space which likely won't work on normal English text containing
punctuation and so on.
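
As a rough illustration of that difference, here is a minimal sketch. The
sample sentence and the pattern are my own assumptions, and the pattern is
only one possible definition of a "word" (letters, optionally joined by an
apostrophe or hyphen), not the one any particular poster used.

```python
import re

# Hypothetical sample text, chosen to include punctuation,
# an apostrophe and a hyphen.
text = "Don't panic: the well-known answer isn't here, is it?"

# Splitting on a single space leaves punctuation glued to the words.
naive = text.split(" ")

# One possible definition of a word: a run of letters, optionally
# continued by an apostrophe or hyphen followed by more letters.
WORD = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")
words = WORD.findall(text)

print(naive)   # includes tokens such as 'panic:' and 'it?'
print(words)   # punctuation is gone, "Don't" and "well-known" survive
```

A real file may need a different pattern, for instance to handle digits or
accented letters, which is exactly why the question needs clarifying first.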

 

Similarly, some may suggest tossing all the tokens into a set to get unique
results. Others may suggest using a dictionary and counting how many times
each word appears (even if that was not required). Others may use lists and
introduce ways to check if something is already in the list or extend the
list. Some may suggest using a data structure like a deque for reasons I am
not able to explain.
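
To make the first three of those concrete, here is a small sketch using a
made-up list of words (the data and variable names are mine, purely for
illustration):

```python
# Hypothetical input: words already split out of some text.
words = ["the", "cat", "saw", "the", "other", "cat"]

# 1. A set keeps one copy of each word but does not remember
#    the order in which the words first appeared.
unique_set = set(words)

# 2. A dictionary can count how often each word appears.
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1

# 3. A list with an explicit membership test keeps first-seen order.
unique_list = []
for word in words:
    if word not in unique_list:
        unique_list.append(word)

print(sorted(unique_set))  # sorted only to get a repeatable display
print(counts)
print(unique_list)
```

For a beginner, the third version is probably closest to what a teacher
expects, even though the membership test gets slow on very long lists.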

 

Others may want to use a list and carefully explain how to search whether an
item is already in a list and how to extend the list. Some may even want
them to keep the list in alphabetical order. Some may want to dump all words
in list1, then iterate on it by adding only words not already in list2 to
list2. Some may want to use a binary tree that finds things in O(log(N)) or
something. And yes, some would like you to make a long list then sort it and
apply a function such as an iterator that returns the next item after
finding all consecutive items that are the same.
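
Two of those ideas can be sketched with the standard library. Note the
substitutions: I use bisect on a sorted list rather than an actual binary
tree, and itertools.groupby as the "iterator" over runs of equal items. The
word list is invented for the example.

```python
import bisect
from itertools import groupby

# Hypothetical input with duplicates, in no particular order.
words = ["pear", "apple", "fig", "apple", "pear", "kiwi"]

# Approach A: keep the list in alphabetical order as words arrive.
# The binary search is O(log N); the insert itself still shifts
# items, so this is not a true O(log N) tree.
ordered = []
for word in words:
    pos = bisect.bisect_left(ordered, word)
    if pos == len(ordered) or ordered[pos] != word:
        ordered.insert(pos, word)

# Approach B: sort everything first, then keep one word per run
# of consecutive equal items.
collapsed = [key for key, _run in groupby(sorted(words))]

print(ordered)
print(collapsed)
```

Both give the same alphabetical list of unique words here; which one reads
better to a new student is a separate question.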

 

There are an amazing number of fairly reasonable ways to solve problems,
with the usual tradeoffs.

 

But if someone is new to this, what answer meets their immediate needs? For
most school projects, unless they are told to use some module, the
expectation is for something fairly simple without worrying that the small
dataset used will run very long or use much memory.

 

And, the solution probably should not be deeply nested in ways hard to read.
I recently wrote a function for someone (not on this group) with about a
dozen lines of code that did things step by step using variables with
somewhat meaningful names. In English, the goal was to take in a DataFrame
and only keep those rows where some subset of the columns all had valid
data. The long version was easy to understand. It did things like make a
narrow copy of the data using just that subset of the columns. Then it
applied a function that returned a vector of True/False if the narrow rows
were all OK or had anything missing. Then that vector was used to index the
original table to select only rows where it was true. But all that could be
collapsed into a single line with a single but somewhat nested statement as
each step sort of feeds into being an index of another step. But I would not
want that kind of code around without paragraphs of comments and in this
case, what difference does it make if temporary variables had a name before
being garbage collected when the function returns?
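
A sketch of what the two versions might look like, assuming the DataFrame is
a pandas one and that "valid" simply means "not missing". The function
names, column names and sample data below are hypothetical, not the code I
actually wrote.

```python
import pandas as pd


def keep_complete_rows(table, columns):
    """Step-by-step version: keep rows with no missing data in `columns`."""
    # Narrow copy of the data using just the columns of interest.
    narrow = table[columns]
    # One True/False per row: True if nothing in that narrow row is missing.
    row_is_complete = narrow.notna().all(axis=1)
    # Use that vector to index the original, full-width table.
    return table[row_is_complete]


def keep_complete_rows_terse(table, columns):
    """Same logic collapsed into a single nested statement."""
    return table[table[columns].notna().all(axis=1)]


# Hypothetical sample data.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "height": [1.5, None, 1.7],
    "weight": [60.0, 70.0, None],
})

result = keep_complete_rows(df, ["height", "weight"])
print(result)
```

In pandas specifically, df.dropna(subset=["height", "weight"]) does much the
same job, but the point here is the readability of the long form versus the
nested one.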

 

Since I find I am more interested in examining different ways to solve a
problem with some thought on what is better or just easier or more elegant
and so on, I am not always a best fit for instructing new students unless
they are sitting in a class I am teaching and mostly at the same level. This
is an open forum where we often have no clue what the student already knows
or has tried and we do not encourage them to post more than a minimal
example. But we can probably assume most just want to get something done,
not create functions and objects they could reuse.

 

I know it slows things down, especially with a moderator, but often the best
answer is to ask some questions before trying to supply an answer. Perhaps a
rapid email exchange directly with a student, perhaps moving on to instant
messaging or phone or video forms of communication would work better for
those interested. When done, you might post a summary if appropriate for
others interested, but note that showing a full solution can be unfair if other
students working on the same problem just latch on to that.

 

 


