Newbie help with manipulating text files

Tom Good Tom_Good1 at excite.com
Fri May 25 18:28:06 EDT 2001


"jk" <user at host.com> wrote in message news:<20010525.181114.1143408282.868 at dsl-64-194-28-125.telocity.com>...
> I hope someone can help me with this problem, I know I am just thinking
> about it the wrong way.
> 
> I have a file that I want to divide into three parts: a header, a section
> I want to sort, and a footer. The header and footer are not in a
> consistent format, but the middle part I want to sort has a bunch of
> lines that all start with the same word. 

Maybe there's an even better way, but this works for me, and may give you some ideas.



import re
import string

regex_for_midsection = "^middle"

inputLines = """
header header header this is a header
this is a headerthis is a headerthis is a header
this is a headerthis is a headerthis is a header
yep this is a header

middle part z
middle part a
middle part c
middle part d
middle part e

footer this is the footerfooter this is the footerfooter this is the
footerfooter this is the footerfooter this is the footerfooter this is the
footerfooter this is the footerfooter this is the footerfooter this is the
footerfooter this is the footerfooter this is the footerfooter this is the
footerfooter this is the footerfooter this is the footer

"""

def firstIndex(L, func):
    "find index of first item in L where func(item) is true"
    for i in range(len(L)):
        if func(L[i]):
            return i
    raise ValueError("first index not found")


def lastIndex(L, func):
    "find index of last item in L where func(item) is true"
    for i in range(len(L)-1, -1, -1):
        if func(L[i]):
            return i
    raise ValueError("last index not found")


def main():
    lines = inputLines.split("\n")
    
    def matchesMid(line):
        return re.search(regex_for_midsection, line) is not None

    firstMid = firstIndex(lines, matchesMid)
    lastMid = lastIndex(lines, matchesMid)

    start = lines[:firstMid]
    mid = lines[firstMid:lastMid+1]
    mid.sort()
    end = lines[lastMid+1:]

    for L in start + mid + end:
        print(L)


if __name__=="__main__":
    main()
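
For comparison, here is a more compact sketch of the same idea: find the first and last line that start with the common word, sort that slice, and leave everything before and after it alone. The function name sort_midsection and its prefix parameter are made up for illustration, and it assumes the middle lines all begin with the same literal word, so a plain startswith() test can stand in for the regex.

```python
def sort_midsection(lines, prefix):
    """Return a new list in which the block running from the first to the
    last line starting with prefix is sorted; the rest keeps its order."""
    # Indices of every line that starts with the (hypothetical) prefix.
    hits = [i for i, line in enumerate(lines) if line.startswith(prefix)]
    if not hits:
        raise ValueError("no line starts with %r" % prefix)
    first, last = hits[0], hits[-1]
    # Header, sorted middle, footer.
    return lines[:first] + sorted(lines[first:last + 1]) + lines[last + 1:]


sample = ["a header", "", "middle part z", "middle part a",
          "middle part c", "", "the footer"]
result = sort_midsection(sample, "middle")
for line in result:
    print(line)
```

To try it on a real file, something like open(path).read().split("\n") should give you the list of lines, where path is whatever file you are working with.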


