[Tutor] design question -- nested loops considered harmful?

Tue Nov 30 08:13:11 CET 2004

>  > def pars_file(list_of_lines):
>  > ####internal FunctionS###########
>  >     def check_flags(line,flags=item_flags,adict=data_dict):
>  >        for item in flags:
>  >           if line.startswith(item):
>  >              adict[item]=line[len(item)]
>  >              return
>  > ####end internal functions####
>  > ####start main function suite####
>  >     data_dict={}
>  >     for line in list_of_lines:
>  >        check_flags(line)
>  >     return data_dict

> If anyone could either explain the conventional wisdom or set straight
> my belief that distaste for nested defs is indeed widespread and well
> founded, I'd be grateful.

Hi Brian,

It's a funny thing for me: whenever I write a nested definition, I usually
find that the embedded definition is useful enough to be an outer
definition in its own right.  *grin*

It does seem that internal definitions are mostly avoided.  Part of this
might be because Python didn't have real lexical scope until nested scopes
arrived in Python 2.1 and 2.2, but I also suspect it's because they're
harder to test, since they're only accessible from within the surrounding
function.
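
To make the testing point concrete, here's a minimal sketch.  The function
names are made up for illustration and aren't from Brian's code.  It only
shows that a nested def can't be reached from outside its enclosing
function, while a module-level helper can be called and checked by itself:

```python
def scale_all(values):
    # 'double' is local to scale_all(); nothing outside can call it.
    def double(x):
        return 2 * x
    return [double(v) for v in values]


def triple(x):
    # A module-level helper can be imported and tested on its own.
    return 3 * x


def scale_all_by_three(values):
    return [triple(v) for v in values]


print(scale_all([1, 2, 3]))           # exercises double() only indirectly
print(triple(5))                      # exercises the helper directly

try:
    double(5)                         # no such name at module level
except NameError as err:
    print("can't reach the inner function:", err)
```

The inner version still works fine; the only way to exercise double(),
though, is to go through scale_all().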

I can hardly write an inner definition without being tempted to make it
generally usable.  For the function above, for example, I can't help but
see if a general definition might work:

###
import re

def parse_tagged_line(line):
    """A general parser for "tagged" lines of the form:

        [tag_name] rest_of_line

    Returns a 2-tuple (tag_name, rest_of_line).  The rest_of_line part
    keeps any whitespace that follows the closing bracket.
    If the line doesn't appear to be tagged, returns (None, line).
    """
    regex = re.compile(r"""
                          \[
                          ([^\]]*)    ## The tag
                          \]
                          (.*)$       ## followed by the rest of the line
                         """, re.VERBOSE)
    match = regex.match(line)         ## match once and reuse the result
    if match:
        return match.groups()
    else:
        return (None, line)
###

This generalizes the line-parsing code from the inner loop: instead of
checking against a fixed list of flags, it tries to parse anything that
looks like a tagged line.  For example:

###
>>> parse_tagged_line("[email] dyoo at hkn.eecs.berkeley.edu")
('email', ' dyoo at hkn.eecs.berkeley.edu')
>>> parse_tagged_line("[Name] Brian van den Broek")
('Name', ' Brian van den Broek')
>>> parse_tagged_line("This is just a regular line")
(None, 'This is just a regular line')
###

If we have 'parse_tagged_line()', then the parse_file() function can be
flattened down a bit, from:

###
def parse_file(list_of_lines):
    data_dict = {}
    for line in list_of_lines:
        for item in item_flags:
            if line.startswith(item):
                data_dict[item] = line[len(item):]
                break
    return data_dict
###

to something like this:

###
def parse_file(list_of_lines):
    data_dict = {}
    for line in list_of_lines:
        tag, rest = parse_tagged_line(line)
        if tag in item_flags:
            data_dict[tag] = rest
    return data_dict
###
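
Here's a small end-to-end sketch of the flattened version, in case it helps
to see it run.  I'm assuming that item_flags holds bare tag names like
"Name" rather than line prefixes like "[Name]", which is a change from the
startswith() version; the sample flags and lines are made up.  The regex is
the same pattern as before, just written compactly and compiled once:

```python
import re

# Assumption: flags are bare tag names, not prefixes such as "[Name]".
item_flags = ["Name", "email"]

# Same pattern as the verbose regex: "[tag]" followed by the rest of the line.
TAGGED_LINE = re.compile(r"\[([^\]]*)\](.*)$")


def parse_tagged_line(line):
    """Return (tag_name, rest_of_line), or (None, line) if untagged."""
    match = TAGGED_LINE.match(line)
    if match:
        return match.groups()
    return (None, line)


def parse_file(list_of_lines):
    data_dict = {}
    for line in list_of_lines:
        tag, rest = parse_tagged_line(line)
        if tag in item_flags:         # untagged lines give tag=None, skipped
            data_dict[tag] = rest
    return data_dict


lines = [
    "[Name] Brian van den Broek",
    "[email] dyoo at hkn.eecs.berkeley.edu",
    "[Date] Tue Nov 30",              # tagged, but not one of our flags
    "This is just a regular line",
]
print(parse_file(lines))
```

Note that the stored values keep the space after the closing bracket; call
rest.strip() before storing if that isn't wanted.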

Embedded inner loops can be a ripe target for refactoring.  Sometimes, the
refactoring makes absolutely no sense at all, and the inner loop is better
left alone as it is.  In the example above, though, I think the extraction
can help.