I'm an idiot

Bengt Richter bokr at oz.net
Sat Jun 29 05:16:44 EDT 2002


On Sat, 29 Jun 2002 01:00:22 GMT, David <david_griswold1 at yahoo.com> wrote:

>OK, I am the first to admit it.  I am an idiot.  I have RTFM on this over 
>and over, and I can still not figure out what I am doing wrong.
>
IMHO the key thing to learn from all the responses is not the solution to the
particular problem, but some hints as to how you can figure things out for yourself.

That is one of the nice things about Python. You can effectively ask it what is
happening as you walk through your problem interactively.

>I think the intent of the code is obvious, but just to clarify, I want to 
>read every line in a file and write those lines back out to another file, 
>but with leading and trailing space removed.  I also want to have some 
>elegant way to determine that I have reached the end of the file and 
>break out of the loop.
>
>And just to explain my stupidity, my reference language is BASIC.  I 
>could have written this in BASIC in a minute, but I need to learn 
>something new.  Any help would be appreciated.
>
>David
>
>
>f=open('c:\\temp\\temp.txt', 'r')
>g=open('c:\\temp\\temp1.txt', 'w')
>while 1:
>    try:
>        s=f.readline
>        g.write(s.split())
>    except IOError:
>        break
>
>g.close
>f.close

Ok, why doesn't that work, you probably asked yourself.
First thing, did you have a known temp.txt file? You'd want
one that will demonstrate that things are working, so make one up,
and store it where you are specifying it to be e.g.:

--< temp.txt >--
   Three leading spaces and not trailing.
Second line with four trailing spaces.    
Third line has no spaces around it, and has blank line following.

	Last line, with leading tab and ending with newline.
--

So the first thing is opening the file:
 >>> f=open('c:\\temp\\temp.txt', 'r')
Now what is f at this point? You can check to make sure:
 >>> f
 <open file 'c:\temp\temp.txt', mode 'r' at 0x007CF3A0>

Well, that seems to have worked.

Now before putting it in a loop, let's just try the read statement:
 >>> s=f.readline

You expect a string, so check to make sure:
 >>> s
 <built-in method readline of file object at 0x007CF3A0>
Aha. Not what you expected. Most built-in stuff has doc strings, so here you can
print either s.__doc__ or f.readline.__doc__, since they both refer to the
same thing at this point:
 >>> print f.readline.__doc__
 readline([size]) -> next line from the file, as a string.

 Retain newline.  A non-negative size argument limits the maximum
 number of bytes to return (an incomplete line may be returned then).
 Return an empty string at EOF.

We want the whole line here, and we'll trust that input lines are
reasonable length, so we'll want f.readline(), which we can test:
(and we'll pretend we didn't see that bit about EOF)

 >>> s = f.readline()
What did we get?
 >>> s
 '   Three leading spaces and not trailing.\n'
Looks good so far. Now what about s.split() ? Let's see:

 >>> s.split()
 ['Three', 'leading', 'spaces', 'and', 'not', 'trailing.']

Hm, that doesn't look ready to write out. Must be another string method
that does the job. dir(s) will tell us what other s.xxx methods there are
associated with s (here s is bound to a string, so we'll see string methods):

 >>> dir(s)
 ['__add__', '__class__', '__contains__', '__delattr__', '__eq__', '__ge__', '__getattribut
 e__', '__getitem__', '__getslice__', '__gt__', '__hash__', '__init__', '__le__', '__len__'
 , '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__repr__', '__rmul__', '__setat
 tr__', '__str__', 'capitalize', 'center', 'count', 'decode', 'encode', 'endswith', 'expand
 tabs', 'find', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace', 'istitle',
 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'replace', 'rfind', 'rindex', 'rjust', 'rst
 rip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upp
 er']

Whoa, that line wrapped. lstrip, rstrip, and strip look like the obvious candidates, with
the last likely to do the job. So check it out:

You can print the docstring info:
 >>> print s.strip.__doc__
 S.strip() -> string

 Return a copy of the string S with leading and trailing
 whitespace removed.

Or, with version 2.1 or later, you can use help():
 >>> help(s.strip)
 Help on built-in function strip:

 strip(...)
     S.strip() -> string

     Return a copy of the string S with leading and trailing
     whitespace removed.

Ok, let's make sure it works:
 >>> s.strip
 <built-in method strip of str object at 0x007A1E30>
Oops. We know about that one now. Need a () to execute the method ...
 >>> s.strip()
 'Three leading spaces and not trailing.'

s is still there for comparison, since we have not rebound it with s=s.strip() or something:
 >>> s
 '   Three leading spaces and not trailing.\n'

Well, the leading spaces disappeared ok, but note that the newline also disappeared with .strip().
The virtue of trying things out ;-) So if we want to write out the stripped strings
as lines, we'll have to add the '\n' back on.
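
To put the split/strip difference side by side, here is a small sketch. It assumes a current Python 3 interpreter rather than the 2.x session shown here, and the sample string simply mimics the first line of temp.txt:

```python
# Sketch only (Python 3 syntax); `line` mimics the first line of temp.txt.
line = '   Three leading spaces and not trailing.\n'

words = line.split()         # a list of words, not something write() accepts
stripped = line.strip()      # a string; leading/trailing whitespace removed,
                             # and that includes the trailing newline
restored = stripped + '\n'   # put the line terminator back before writing

print(words)
print(repr(stripped))
print(repr(restored))
print(repr(line.lstrip()))   # only leading whitespace removed; newline kept
print(repr(line.rstrip()))   # only trailing whitespace removed; newline gone
```

The repr() calls are there so the newline shows up as \n instead of being printed as an actual line break.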

Perhaps we are ready to try that:

 >>> g=open('c:\\temp\\temp1.txt', 'w')
 >>> g
 <open file 'c:\temp\temp1.txt', mode 'w' at 0x0083D0A0>

Make sure it's what we think:
 >>> s.strip()+'\n'
 'Three leading spaces and not trailing.\n'
Ok, write it:
 >>> g.write(s.strip()+'\n')

No complaint, so we might as well do the rest:

 >>> s = f.readline()
 >>> s
 'Second line with four trailing spaces.    \n'
 >>> s.strip()+'\n'
 'Second line with four trailing spaces.\n'
 >>> g.write(s.strip()+'\n')
 >>> s = f.readline()
 >>> s
 'Third line has no spaces around it, and has blank line following.\n'
 >>> s.strip()+'\n'
 'Third line has no spaces around it, and has blank line following.\n'
 >>> g.write(s.strip()+'\n')
 >>> s = f.readline()
 >>> s
 '\n'
 >>> s.strip()+'\n'
 '\n'
 >>> g.write(s.strip()+'\n')
 >>> s = f.readline()
 >>> s
 '\tLast line, with leading tab and ending with newline.\n'
 >>> s.strip()+'\n'
 'Last line, with leading tab and ending with newline.\n'
 >>> g.write(s.strip()+'\n')
 >>> s = f.readline()
 >>> s
 ''
Oops, what was that?
 >>> help(f.readline)
 Help on built-in function readline:

 readline(...)
     readline([size]) -> next line from the file, as a string.

     Retain newline.  A non-negative size argument limits the maximum
     number of bytes to return (an incomplete line may be returned then).
     Return an empty string at EOF.

Aha. (we should've remembered about EOF from before ;-)
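
That empty string is what makes an end-of-file test possible: a blank line in the file comes back as '\n', which is not empty, so only EOF yields ''. A minimal sketch, assuming Python 3 and using io.StringIO as a stand-in for the open file:

```python
# Sketch only (Python 3); io.StringIO stands in for the open file.
import io

f = io.StringIO('text\n\nlast\n')

lines = []
while True:
    s = f.readline()
    if not s:            # '' is returned only at EOF
        break
    lines.append(s)      # a blank line arrives as '\n', which is not empty

print(lines)
```

No exception is raised at EOF, which is why the "except IOError" in the original loop never got a chance to break out.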
 >>> f.close
 <built-in method close of file object at 0x007A5180>
;-)
 >>> f.close()
 >>> g.close()
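
The same point as with readline and strip applies to close: naming a method only hands you the method object, and nothing happens until it is called. A small sketch, assuming Python 3, with io.StringIO again standing in for a real file so nothing depends on c:\temp existing:

```python
# Sketch only (Python 3); io.StringIO stands in for a real file.
import io

f = io.StringIO('one line\n')

method = f.close            # just a reference to the method; nothing is called
still_open = not f.closed   # the file is still open at this point

f.close()                   # the parentheses perform the call
now_closed = f.closed

print(still_open, now_closed)
```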

Check results:
 >>> h=open('c:\\temp\\temp1.txt', 'r')
 >>> h.read()
 'Three leading spaces and not trailing.\nSecond line with four trailing spaces.\nThird lin
 e has no spaces around it, and has blank line following.\n\nLast line, with leading tab an
 d ending with newline.\n'

That wrapped, but looks ok, so now use the knowledge gained to revise your original.
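
For reference, one possible revision might look like the sketch below. It is not the only way to do it, and it makes some assumptions: it is written for Python 3, strip_lines is a hypothetical helper name, and a temporary directory replaces the c:\temp paths so that it runs anywhere.

```python
# One possible revision (Python 3 sketch). strip_lines is a hypothetical
# helper name; a temporary directory replaces the c:\temp paths.
import os
import tempfile


def strip_lines(src_path, dst_path):
    """Copy src to dst, removing leading and trailing whitespace per line."""
    # The with statement closes both files, even if an error occurs.
    with open(src_path, 'r') as f, open(dst_path, 'w') as g:
        while True:
            s = f.readline()            # note the (): call the method
            if s == '':                 # empty string means EOF;
                break                   # a blank line would be '\n'
            g.write(s.strip() + '\n')   # strip() drops the newline; re-add it


sample = ('   Three leading spaces and not trailing.\n'
          'Second line with four trailing spaces.    \n'
          'Third line has no spaces around it, and has blank line following.\n'
          '\n'
          '\tLast line, with leading tab and ending with newline.\n')

tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, 'temp.txt')
dst = os.path.join(tmpdir, 'temp1.txt')

with open(src, 'w') as out:
    out.write(sample)

strip_lines(src, dst)

with open(dst, 'r') as h:
    result = h.read()
print(repr(result))
```

The explicit readline loop is kept to stay close to the original code. In current Python, "for s in f:" iterates over the lines directly and makes the EOF test unnecessary.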

My point here is just to show that you need not be at a loss when a program as
a whole doesn't work. Just break it down and check on what your code is actually
doing. After a while, just a well-placed print statement or two to check on
intermediate results will usually tell you enough.

To do that, you probably want to use an editor to edit a file representing the program,
and then either run it from the editor, if your editor supports that, or run it from
a separate command line window, and switch back and forth between editor and cmd line window.

When you want to be even more systematic about testing your code, you can set up automatic
testing to verify expected results. This is prudent once things get past trivial and you
need to make sure evolving improvements don't break existing functionality. There are modules
to help with that (see doctest & unittest).
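
As a rough illustration of the doctest approach, the sketch below puts an interactive-style session into a docstring and lets doctest re-run it. It assumes Python 3, and clean_line is a hypothetical helper made up for this example:

```python
# Sketch only (Python 3); clean_line is a hypothetical helper.
import doctest


def clean_line(s):
    """Return s without surrounding whitespace, terminated by a newline.

    >>> clean_line('   Three leading spaces and not trailing.\\n')
    'Three leading spaces and not trailing.\\n'
    >>> clean_line('\\n')
    '\\n'
    """
    return s.strip() + '\n'


if __name__ == '__main__':
    # Runs every >>> example in this module's docstrings; silent on success.
    doctest.testmod()
```

The examples in the docstring are the same kind of thing you would type at the prompt, so a session that worked can be pasted in and kept as a regression check.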

You can always keep an extra console window open with python running to try little snippets
and type out help docs etc. Python makes it easy to explore what's really happening.
Then if you're stumped, you can post an excerpt from a session showing what you tried. ;-)

HTH

Regards,
Bengt Richter


