[Tutor] how to read from a txt file

Kent Johnson kent37 at tds.net
Sun Feb 13 21:06:12 CET 2005


Brian van den Broek wrote:
> Kent Johnson said unto the world upon 2005-02-13 14:04:
>> Brian van den Broek wrote:
>>
>>> Since your files are quite short, I'd do something like:
>>>
>>> <code>
>>> data_file = open(thedata.txt, 'r') # note -- 'r' not r
>>> data = data_file.readlines()       # returns a list of lines
>>>
>>> def process(list_of_lines):
>>>     data_points = []
>>>     for line in list_of_lines:
>>>         data_points.append(int(line))
>>>     return data_points
>>>
>>> process(data)
>>
>>
>>
>> This can be done much more simply with a list comprehension using 
>> Python's ability to iterate an open file directly:
>> data_file = open('thedata.txt', 'r') # note -- 'thedata.txt' not thedata.txt :-)
> 
> 
> Gah! :-[   Outsmarting myself in public again. (At least I'm good at 
> something :-) )
> 
>> data_points = [ int(line) for line in data_file ]
> 
>> then process the data with something like
>> import time
>> for val in data_points:
>>   # do something with val
>>   time.sleep(300)
>>
>> Alternately (and my preference) the processing could be done in the 
>> read loop like this:
>> data_file = open('thedata.txt', 'r')
>> for line in data_file:
>>   val = int(line)
>>   # do something with val
>>   time.sleep(300)
>>
>> Kent
> 
> 
> I do get that for the minimal logic I posted, this way is much simpler. 
> But, isn't my way with a separate function more easily extended? (To 
> deal with cases where there is more than just ints on lines, or where 
> the data needs to be similarly processed multiple times, etc.)

If the processing is per line, any of the three can be extended by calling a user function instead 
of int(), e.g.
def process_line(line):
    # do something with a line; int() is only a placeholder here
    val = int(line)
    return val

data_points = [ process_line(line) for line in data_file ]
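
As a rough sketch of what that might look like when a line holds more than a bare int: the "label value" line layout and the sample data below are made up for illustration, and io.StringIO just stands in for the open file.

```python
import io

def process_line(line):
    # Assumed layout (hypothetical): "<label> <integer>", e.g. "temp 42"
    label, value = line.split()
    return label, int(value)

# io.StringIO stands in for open('thedata.txt', 'r')
data_file = io.StringIO("temp 42\npressure 1013\n")
data_points = [process_line(line) for line in data_file]
```

The list comprehension itself does not change; only process_line() needs to know about the line format.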

If you need to maintain some kind of state then the list comprehension breaks down and you might 
want to use
for line in data_file:
   # ...

or even a class like this:
http://mail.python.org/pipermail/tutor/2005-February/035582.html
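
A minimal sketch of the plain-loop version, where the state is a running total carried from one line to the next. The sample numbers are invented and io.StringIO stands in for the real file.

```python
import io

# io.StringIO stands in for open('thedata.txt', 'r')
data_file = io.StringIO("3\n10\n7\n")

running_total = 0
totals = []
for line in data_file:
    val = int(line)
    running_total += val          # state that survives between lines
    totals.append(running_total)  # record the total so far
```

The same thing is awkward to write as a list comprehension because each result depends on the previous one.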

If you need to process the list of lines multiple times in different ways then using readlines() is 
appropriate.
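
For example, something along these lines (a sketch only, with invented data and io.StringIO in place of the real file). It also shows why the list is needed: once the file has been read through, iterating over it again yields nothing.

```python
import io

# io.StringIO stands in for open('thedata.txt', 'r')
data_file = io.StringIO("4\n8\n15\n")
lines = data_file.readlines()    # whole file as a list of strings

# First pass over the list: convert to ints
values = [int(line) for line in lines]

# Second pass over the same list: count non-blank lines
non_blank = sum(1 for line in lines if line.strip())

# The file itself is now exhausted, so a second pass over it finds nothing
leftover = list(data_file)
```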

I tend to prefer solutions that make fewer intermediate lists, using iterators instead. This seems 
to be the modern Python style with the introduction of list comprehensions, generator functions, 
itertools, generator expressions...
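
A small sketch of that style, assuming one int per line (sample data invented, io.StringIO standing in for the file). No list of all the values is ever built; each line is converted only when something asks for it.

```python
import io
from itertools import islice

# io.StringIO stands in for open('thedata.txt', 'r')
data_file = io.StringIO("1\n2\n3\n4\n5\n")

# Generator expression: nothing is read from the file yet
values = (int(line) for line in data_file)

first_three = list(islice(values, 3))  # pulls only the first three lines
rest_total = sum(values)               # consumes whatever is left
```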

Kent

> 
> I do feel a YAGNI coming on, though :-)

Seems appropriate :-)

Kent

> 
> Anyway, thanks for improving my attempt to help.
> 
> Best,
> 
> Brian vdB