Iterate over text file, discarding some lines via context manager

Dave Angel davea at davea.name
Fri Nov 28 11:15:32 EST 2014


On 11/28/2014 11:01 AM, Ned Batchelder wrote:
> On 11/28/14 10:22 AM, Dave Angel wrote:
>> On 11/28/2014 10:04 AM, fetchinson . wrote:
>>> Hi all,
>>>
>>> I have a feeling that I should solve this by a context manager but
>>> since I've never used them I'm not sure what the optimal (in the
>>> python sense) solution is. So basically what I do all the time is
>>> this:
>>>
>>> for line in open( 'myfile' ):
>>>      if not line:
>>>          # discard empty lines
>>>          continue
>>>      if line.startswith( '#' ):
>>>          # discard lines starting with #
>>>          continue
>>>      items = line.split( )
>>>      if not items:
>>>          # discard lines with only spaces, tabs, etc
>>>          continue
>>>
>>>      process( items )
>>>
>>> You see I'd like to ignore lines which are empty, start with a #, or
>>> are only white space. How would I write a context manager so that the
>>> above simply becomes
>>>
>>> with some_tricky_stuff( 'myfile' ) as items:
>>>      process( items )
>>>
>>
>> I see what you're getting at, but a context manager is the wrong
>> paradigm.  What you want is a generator.   (untested)
>>
>> def mygenerator(filename):
>>      with open(filename) as f:
>>          for line in f:
>>              if not line: continue
>>              if line.startswith('#'): continue
>>              items = line.split()
>>              if not items: continue
>>              yield items
>>
>> Now your caller simply does:
>>
>> for items in mygenerator(filename):
>>        process(items)
>>
>>
>
> I think it's slightly better to leave the open outside the generator:
>
> def interesting_lines(f):
>      for line in f:
>          line = line.strip()
>          if line.startswith('#'):
>              continue
>          if not line:
>              continue
>          yield line
>
> with open("my_config.ini") as f:
>      for line in interesting_lines(f):
>          do_something(line)
>
> This makes interesting_lines a pure filter, and doesn't care what sort
> of sequence of strings it's operating on.  This makes it easier to test,
> and more flexible.  The caller's code is also clearer in my opinion.
>
Thank you, I agree.  I was trying to preserve the factoring that the OP 
had implied.  I notice you also factored out the split.
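
If the split should stay inside the filter, as in the OP's loop, the two
approaches combine easily. Here is a minimal sketch that assumes the caller
wants whitespace-split fields rather than whole lines. The name
interesting_items and the sample data are made up for illustration and do
not come from either earlier post.

```python
def interesting_items(lines):
    """Yield the split fields of each line that is not blank or a comment."""
    for line in lines:
        line = line.strip()
        # After strip(), empty and whitespace-only lines are both ''.
        if not line or line.startswith('#'):
            continue
        yield line.split()


# Any iterable of strings works, so a plain list can stand in for a file.
sample = ["# a comment\n", "\n", "   \t\n", "alpha beta\n", "  gamma\n"]
result = list(interesting_items(sample))
```

With a real file the caller would still do the open itself, e.g.
"with open(filename) as f:" followed by "for items in interesting_items(f):".
Two differences from the OP's loop are worth knowing. Stripping first means
an indented "#" line is also treated as a comment. And a bare "if not line"
test never fires on lines read from a file, because each one still carries
its newline; the strip() is what makes that check meaningful.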

> BTW: this example is taken verbatim from my PyCon presentation on
> iteration, if you are interested: http://nedbatchelder.com/text/iter.html
>

Thanks for the link.  I've started reading it, and I'll definitely read 
the whole thing.

-- 
DaveA


