[Tutor] Readlines(longish!)

Alan Gauld alan.gauld@blueyonder.co.uk
Thu Jun 26 18:49:02 2003


> I know you're the expert Alan, but you might not have left high
> level concepts for the moment.

I'm not really that much of an expert on the innards of Python
but for this discussion that doesn't really matter...

> Just that, suppose you only have four months in a year. The computer
> would probably have:
>
> 00 00- January.
> 01 01- February.
> 02 10- March
> 03 11- April

Nope, I disagree. You are thinking of low level languages like
C, I suspect. When all you have is an array, everything looks
like an array... But in Python you would probably store the
months in a dictionary, which under the covers could be implemented
as a sparse array, a hash table or a balanced tree. In none
of those cases is there any meaningful sequence number.

In fact if you inserted a new item into a balanced tree or
hash table implementation, the entire sequencing may well change!
But the foreach construct will still work just as well.
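
A small sketch of that point, written in Python 3 syntax (an assumption on my part). The names month_days, month_names and seen are made up for illustration. A set is shown next to the dictionary because current Python dictionaries do keep insertion order, whereas a set promises no particular order at all.

```python
# Hypothetical data: month name -> number of days (non-leap year).
month_days = {"January": 31, "February": 28, "March": 31, "April": 30}

# No counter anywhere: the for loop just asks the dict for its next key.
for name in month_days:
    print(name, month_days[name])

# A set has no defined order, yet the same loop still visits every item once.
month_names = set(month_days)
seen = []
for name in month_names:
    seen.append(name)

# Compare contents only, since the set's order is not something to rely on.
print(sorted(seen) == sorted(month_days))
```

The loop body never mentions a position, so it does not matter how the container chooses to arrange its items internally.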

> It won't bother thinking about a February. If it counts through the
> months, it won't conceive of time, metaphors, or abstract items. Just
> add one each time.  It will have to do that.

No, it doesn't have to do that. In fact even in a primitive language
like C it could implement it as a linked list. In that case it
would have to walk through the list looking at the content; it
wouldn't use any kind of numerical index. You only need an index
when dealing with array-like structures.

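Consider the C-like pseudo code:

typedef struct month {
   char name[20];
   struct month *next;
   } month;

char* months[] = {"Jan","Feb",...."Dec"};

month *year, *last, *m;   /* year stays at the head of the list */
int i;

year = malloc(sizeof(month));
strcpy(year->name, months[0]);
year->next = NULL;
last = year;

for (i=1; i<12; i++){
    m = malloc(sizeof(month));
    strcpy(m->name, months[i]);
    m->next = NULL;
    last->next = m;       /* append the new node at the end */
    last = m;
    }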

Now we had to use indexing to set that up because of the limitations
of C. But when it comes to processing the months later on, we just do:

n = year;
while (n){
    puts(n->name);
    n = n->next;
    }

Now this is the behaviour we get from Python's for loop. It iterates
over the sequence without a care for its position. In fact it's
actually better than that because it uses an iterator mechanism,
which means it doesn't even know whether the sequence is
an array, a string, a dictionary, a user defined balanced AVL
tree or whatever. So long as the collection keeps feeding it
a "next" then the for loop keeps processing things.
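
To make the iterator idea concrete, here is a minimal sketch in Python 3 syntax (an assumption on my part). The classes Month and Year and the helper collect() are hypothetical names invented for this example; they mirror the linked list from the C pseudo code. The only thing the for loop relies on is that the object can hand out its next item.

```python
class Month:
    """One node of a singly linked list, like the C struct above."""
    def __init__(self, name):
        self.name = name
        self.next = None


class Year:
    """Hypothetical container: a linked list of Month nodes."""
    def __init__(self, names):
        self.head = None
        last = None
        for name in names:          # no index needed to build it either
            node = Month(name)
            if last is None:
                self.head = node
            else:
                last.next = node
            last = node

    def __iter__(self):
        # Generator: hand out each name, then follow the "next" link.
        node = self.head
        while node is not None:
            yield node.name
            node = node.next


def collect(items):
    """Run the same for loop over any iterable and return what it saw."""
    result = []
    for item in items:
        result.append(item)
    return result


print(collect(Year(["Jan", "Feb", "Mar"])))   # linked list
print(collect(["Jan", "Feb", "Mar"]))         # list
print(collect("abc"))                         # string
print(collect({"Jan": 31, "Feb": 28}))        # dict yields its keys
```

The loop inside collect() is identical in all four calls; only the collection behind it changes.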

> You're adding LineCount wasn't really more confusing.
> But, you might to do that but be less descriptive, and say 'lc' or 'l'.
> And you won't know what it was later. Think so?

Bad variable naming will cause confusion regardless of the
language! :-)

> If you just have 'line' you could use it both for the printed line,
> and the number of that line. Both fit well trying to read like plain
> English.

I originally used line, then realized it was ambiguous, so changed
it to lineCount. Careful name selection is part of good programming.

> Are you sure you've never added counters before in lots of places
> around, and then left very little working code on one screen/page?

Not when using a for loop. If I need to access the index I will
usually wind up using the while loop which is deliberately
flexible and lower level than the for loop.
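
A rough side by side of the two styles, in Python 3 syntax (an assumption on my part). The list lines and the result list numbered are made-up sample names; lineCount is the counter name used earlier in this thread.

```python
lines = ["first", "second", "third"]   # hypothetical sample data

# for loop: just process each item, no position needed.
for line in lines:
    print(line)

# while loop: we manage the counter ourselves because we want the position.
numbered = []
lineCount = 0
while lineCount < len(lines):
    numbered.append("%d: %s" % (lineCount + 1, lines[lineCount]))
    lineCount += 1

print(numbered)
```

Python also has a built-in enumerate() that hands a for loop both the position and the item, which is another option when the index really is needed.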

> >> Consider the common case of processing a file with a while loop
> >> until you reach the end of file... You don't need to track the
> >> line number.
>
> Probably not, if there are functions to read from the start, or end
> of a file.

Or as in the case of Python to just keep reading the next line
until there's nothing left. The xreadlines() method in Python
can be simulated using a while loop just about as easily:

line = f.readline()  # note single line
while line:          # if line is not empty
    print line
    line = f.readline()  # next!

Note the loop doesn't check for start or end of file it just
keeps reading till there's nothing returned.
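
Here is a self-contained version of the same loop that can be run without a file on disk, in Python 3 syntax (an assumption on my part). io.StringIO stands in for a real file object, and the names via_while and via_for are invented for the comparison.

```python
import io

# io.StringIO stands in for a real file opened with open(); the text is made up.
f = io.StringIO("alpha\nbeta\ngamma\n")

via_while = []
line = f.readline()          # note single line
while line:                  # an empty string means nothing is left
    via_while.append(line)
    line = f.readline()      # next!

f.seek(0)                    # rewind so we can read the same data again
via_for = []
for line in f:               # the file object feeds the loop its next line
    via_for.append(line)

print(via_while == via_for)
```

A blank line in the middle of a file comes back as "\n", which is not empty, so the while loop only stops at the real end of the data.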

> >> When you have to add extra variables, it just adds more text on
> >> your screen, and less space to review the main stuff you've
> >> actually written.

Absolutely, that's why FOR doesn't mess with indexes, it just
hands you the object to process and you leave the access to
Python to worry about. In practice you are far more likely to
not need the index than you are to need it, so Python's approach
minimises the spurious code.

If you don't find that to be the case, consider whether you are
using FOR loops when maybe you should be using a while. Or maybe
you aren't taking advantage of the full power of Python's FOR loop?

> >> For the relatively few cases where you need an index counter then
> >> the amount of extra text is small, and you have the choice of
> >> using a while loop, or for loop. Mostly when you use a for loop
> >> you just want to process each item.
>
> Except when you're debugging, you probably want to print just about
> anything.

Again I disagree. When I'm debugging I print the little that I need.
If I need to examine a complete object, say, I will normally use
the debugger for that.

> Like when a function doesn't seem to be running correctly.

Learn to use the debugger, it makes life so much easier. I
typically only use print statements to print the entry into a
function, the input values and return value(s) of a function.
If they don't match, I will first start the interpreter
and test the function's behaviour using the >>> prompt. If that
doesn't help, I start the debugger and step through the function
line by line, examining as I go.
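
As a sketch of that minimal style of print debugging, in Python 3 syntax (an assumption on my part): the function average() is a made-up example, and the two print calls mark only the entry with its inputs and the value about to be returned.

```python
def average(values):
    """Hypothetical function under suspicion."""
    print("average() called with", values)       # entry and input values
    result = sum(values) / len(values)
    print("average() returning", result)         # return value
    return result


average([2, 4, 6])
```

If the printed input and output don't agree with what you expect, the next step would be to call average() by hand at the >>> prompt, and after that to step through it in a debugger such as the standard library's pdb module.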

print statements are powerful debugging tools but they are a
very blunt instrument and can obfuscate as much as they reveal
if overused.

Alan G
Author of the Learn to Program web tutor
http://www.freenetpages.co.uk/hp/alan.gauld