while c = f.read(1)

Fri Aug 19 02:18:12 EDT 2005

Greg McIntyre wrote:
> I have a Python snippet:
> 
>   f = open("blah.txt", "r")
>   while True:
>       c = f.read(1)
>       if c == '': break # EOF

That could read like this:
      if not c: break # EOF
(see below for comments on what is true/false)

>       # ... work on c
> 
> Is there some way to make this code more compact and simple? It's a bit
> spaghetti.

Not at all, IMHO. This is a simple forward-branching exit from a loop in 
explicable circumstances (EOF). It is a common-enough idiom that doesn't 
detract from readability & understandability. Spaghetti is like a GOTO 
that jumps backwards into the middle of a loop for no discernible reason.

> 
> This is what I would ideally like:
> 
>   f = open("blah.txt", "r")
>   while c = f.read(1):
>       # ... work on c
> 
> But I get a syntax error.
> 
>     while c = f.read(1):
>            ^
> SyntaxError: invalid syntax
> 
> And read() doesn't work that way anyway because it returns '' on EOF
> and '' != False.

You have a bit of a misunderstanding here that needs correcting:

In "if <blah>" and "while <blah>", <blah> is NOT restricted to being in 
(True, False). See section 5.10 of the Python Reference Manual:

"""
In the context of Boolean operations, and also when expressions are used 
by control flow statements, the following values are interpreted as 
false: None, numeric zero of all types, empty sequences (strings, tuples 
and lists), and empty mappings (dictionaries). All other values are 
interpreted as true.
"""

... AND it's about time that list was updated to include False explicitly, 
to save nitpicking arguments about whether False is covered by 
"numeric zero of all types" :-)

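To make that concrete, here is a small sketch checking the manual's list, 
with False added. The names falsy_values and truthy_values are mine, purely 
for illustration:

```python
# Values the Reference Manual lists as false, plus False itself.
falsy_values = [None, False, 0, 0.0, 0j, "", (), [], {}]
# A few values that are interpreted as true, including non-empty containers
# whose only element is itself false.
truthy_values = [True, 1, -1, "a", " ", (0,), [0], {"k": None}]

all_falsy = all(not v for v in falsy_values)
all_truthy = all(bool(v) for v in truthy_values)

# The empty string compares unequal to False, yet it still tests as false.
empty_is_not_False = ("" != False)
empty_tests_false = not ""
```

So '' != False is correct as far as it goes, but it is beside the point: 
"if" and "while" test the truth value, not equality with False.
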
> If I try:
> 
>   f = open("blah.txt", "r")
>   while (c = f.read(1)) != '':
>       # ... work on c
> 
> I get a syntax error also. :(
> 
> Is this related to Python's expression vs. statement syntactic
> separation? How can I write this code more nicely?
> 
> Thanks
> 

How about
    for c in f.read():
?
Note that this reads the whole file into memory (changing \r\n to \n on 
Windows) ... performance-wise for large files you've spent some memory 
but clawed back the rather large CPU time spent doing f.read(1) once per 
character. The "more nicely" factor improves outasight, IMHO.
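
If reading the whole file at once is not acceptable, the two-argument form 
of the built-in iter() is another way to hide the EOF test. A minimal 
sketch, assuming a helper I've called chars() (a hypothetical name) and 
using io.StringIO as a stand-in for the real file:

```python
import io
from functools import partial

def chars(f):
    """Return an iterator over f, one character at a time, ending at EOF."""
    # iter(callable, sentinel) calls f.read(1) repeatedly and stops as soon
    # as the call returns the sentinel '' (which is what read() gives at EOF).
    return iter(partial(f.read, 1), "")

f = io.StringIO("abc\n")   # stand-in for open("blah.txt", "r")
seen = []
for c in chars(f):
    seen.append(c)         # ... work on c
```

This still calls read(1) once per character, so it buys tidiness, not speed.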

Mild curiosity: what are you doing processing one character at a time 
that can't be done with a built-in function, a standard module, or a 
3rd-party module?