Refactoring a generator function

max maxume at yahoo.com
Sat Dec 4 12:48:41 EST 2004


Kent Johnson <kent3737 at yahoo.com> wrote in
news:41b1d509$1_3 at newspeer2.tds.net: 

> Here is a simple function that scans through an input file and
> groups the lines of the file into sections. Sections start with
> 'Name:' and end with a blank line. The function yields sections
> as they are found.
> 
> def makeSections(f):
>      currSection = []
> 
>      for line in f:
>          line = line.strip()
>          if line == 'Name:':
>              # Start of a new section
>              if currSection:
>                  yield currSection
>                  currSection = []
>              currSection.append(line)
> 
>          elif not line:
>              # Blank line ends a section
>              if currSection:
>                  yield currSection
>                  currSection = []
> 
>          else:
>              # Accumulate into a section
>              currSection.append(line)
> 
>      # Yield the last section
>      if currSection:
>          yield currSection
> 
> There is some obvious code duplication in the function - this bit
> is repeated 2.67 times ;-): 
>              if currSection:
>                  yield currSection
>                  currSection = []
> 
> As a firm believer in Once and Only Once, I would like to factor
> this out into a separate function, either a nested function of
> makeSections(), or as a separate method of a class
> implementation. Something like this:
> 
> 
> The problem is that yieldSection() now is the generator, and
> makeSections() is not, and the result of calling yieldSection()
> is a new iterator, not the section... 
> 
> Is there a way to do this or do I have to live with the
> duplication? 
> 
> Thanks,
> Kent
> 
>
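
On the quoted question itself: one way to keep a helper is to let the helper be the generator and have makeSections() re-yield whatever it produces. This is only a sketch, not code from the original post. The name flushSection() is made up here, and the list is cleared in place with del so the nested function never has to rebind a name in the enclosing scope.

```python
def makeSections(f):
    currSection = []

    def flushSection():
        # Hypothetical helper: yield a copy of the pending section, if any,
        # then empty the shared list in place.
        if currSection:
            yield currSection[:]
            del currSection[:]

    for line in f:
        line = line.strip()
        if line == 'Name:':
            # Start of a new section: emit whatever was pending first
            for section in flushSection():
                yield section
            currSection.append(line)
        elif not line:
            # Blank line ends a section
            for section in flushSection():
                yield section
        else:
            # Accumulate into a section
            currSection.append(line)

    # Yield the last section
    for section in flushSection():
        yield section
```

The two-line loop still appears at each call site, so the duplication shrinks rather than disappears. The function accepts any iterable of lines, so a list of strings works as well as an open file.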

The version below gets rid of some duplication by ignoring blank lines
altogether, which might be a bug...

def makeSections2(f):
    currSection = []
    for line in f:
        line = line.strip()
        if line:
            if line == 'Name:':
                # Start of a new section
                if currSection:
                    yield currSection
                    currSection = []
            currSection.append(line)
    # Yield the last section
    if currSection:
        yield currSection

but 

def makeSections2(f):
    currSection = []
    for line in f:
        line = line.strip()

        if line:
            if line == 'Name:':
                if currSection:
                    yield currSection
                    currSection = []
            currSection.append(line)

        elif currSection:
            # Blank line ends a section
            yield currSection
            currSection = []

    if currSection:
        yield currSection

should be equivalent.
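
A quick way to check that claim is to run the variants on the same input. The sketch below is an illustration only: the function names makeSectionsSkipBlanks and makeSectionsBlankAware and the sample data are made up, and it assumes the blank-line branch resets currSection after yielding.

```python
def makeSections(f):
    # The original function from the quoted post
    currSection = []
    for line in f:
        line = line.strip()
        if line == 'Name:':
            if currSection:
                yield currSection
                currSection = []
            currSection.append(line)
        elif not line:
            if currSection:
                yield currSection
                currSection = []
        else:
            currSection.append(line)
    if currSection:
        yield currSection

def makeSectionsSkipBlanks(f):
    # Variant that ignores blank lines altogether
    currSection = []
    for line in f:
        line = line.strip()
        if line:
            if line == 'Name:' and currSection:
                yield currSection
                currSection = []
            currSection.append(line)
    if currSection:
        yield currSection

def makeSectionsBlankAware(f):
    # Variant that ends a section on a blank line
    currSection = []
    for line in f:
        line = line.strip()
        if line:
            if line == 'Name:' and currSection:
                yield currSection
                currSection = []
            currSection.append(line)
        elif currSection:
            yield currSection
            currSection = []
    if currSection:
        yield currSection

# Made-up input: 'stray' follows a blank line without a 'Name:' header
sample = ['Name:\n', 'a\n', '\n', 'stray\n', 'Name:\n', 'b\n']

original = list(makeSections(sample))
skipped = list(makeSectionsSkipBlanks(sample))
aware = list(makeSectionsBlankAware(sample))

print(original == aware)    # blank-aware variant agrees with the original
print(original == skipped)  # blank-skipping variant merges 'stray' into the first section
```

The two only differ on input where text follows a blank line without a new 'Name:' header, so whether ignoring blank lines is a bug depends on whether such input can occur.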


