Newbie help with manipulating text files

Alex Martelli aleaxit at yahoo.com
Fri May 25 15:16:23 EDT 2001


"jk" <user at host.com> wrote in message
news:20010525.181114.1143408282.868 at dsl-64-194-28-125.telocity.com...
> I hope someone can help me with this problem, I know I am just thinking
> about it the wrong way.
>
> I have a file that I want to divide into three parts: a header, a section
> I want to sort, and a footer. The header and footer are not in a
> consistent format, but the middle part I want to sort has a bunch of
> lines that all start with the same word.

Is the header every line UP TO, but excluding, the first line of
the middle part?  I.e., everything before the first line that
starts with the key word?


> So the file looks something like this:
>
>
> header header header this is a header
> this is a headerthis is a headerthis is a header
> this is a headerthis is a headerthis is a header
> yep this is a header
>
>
> middle part z
> middle part a
> middle part c
> middle part d
> middle part e
>
> footer this is the footerfooter this is the footerfooter this is the
> footerfooter this is the footerfooter this is the footerfooter this is the
> footerfooter this is the footerfooter this is the footerfooter this is the
> footerfooter this is the footerfooter this is the footerfooter this is the
> footerfooter this is the footerfooter this is the footer

...or are the blank lines the separators?  Or something else...?


> I've been able to use re to grab all of the middle part and put it into a
> sortable list, but I want to be able to print out the whole file with the
> middle part sorted but the header and footer unchanged. I think that what

Ok.  You don't seem to need re, though.

> I want is to read the whole thing into a list and then split the list
> into 3 sublists, manipulate the list I want and then write out a new file
> with the modified lists joined together. The part I don't know how to do
> is to get the header and footer into their own lists.

Let's assume for the sake of argument that the middle part is made
up of all, and only, those lines starting with 'middle part', OK?


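# thefile is assumed to be an already-open file object for the input
wholefile = thefile.readlines()
lastmidl = len(wholefile)

# find the index of the first line of the middle part
for i in range(lastmidl):
    if wholefile[i].startswith('middle part'):
        firstmidl = i
        break
else:
    # no such line at all: the middle part is empty
    firstmidl = lastmidl

# find the index of the first line AFTER the middle part
for i in range(firstmidl, lastmidl):
    if not wholefile[i].startswith('middle part'):
        lastmidl = i
        break

# sort just the middle slice and put it back where it was
midl = wholefile[firstmidl:lastmidl]
midl.sort()
wholefile[firstmidl:lastmidl] = midl

# write everything out, closing the output file explicitly
with open('newfile.txt', 'w') as newfile:
    newfile.writelines(wholefile)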


This, if I understood your specs correctly, should be roughly what
you want... I haven't tested the code, and there may be edge cases
I didn't think of, but I hope this gives you the overall idea!
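
The same idea can also be packaged as a small function that works on
a plain list of lines, which makes it easy to try on sample data
without touching any files.  This is only a sketch under the same
assumption as above (the middle part is the first contiguous run of
lines starting with a given prefix); the name sort_middle and its
prefix parameter are made up for illustration:

```python
def sort_middle(lines, prefix='middle part'):
    """Return a copy of lines in which the first contiguous run of
    lines starting with prefix is sorted; all other lines stay put.
    (Hypothetical helper, not part of any library.)"""
    # skip the header: everything before the first prefixed line
    first = 0
    while first < len(lines) and not lines[first].startswith(prefix):
        first += 1
    # walk to the end of the contiguous run of prefixed lines
    last = first
    while last < len(lines) and lines[last].startswith(prefix):
        last += 1
    # header + sorted middle + footer
    return lines[:first] + sorted(lines[first:last]) + lines[last:]


sample = [
    'this is a header\n',
    '\n',
    'middle part z\n',
    'middle part a\n',
    'middle part c\n',
    '\n',
    'this is the footer\n',
]
result = sort_middle(sample)
```

One edge case this does NOT handle: if the last line of the file
belongs to the middle part and lacks a trailing newline, sorting may
move it ahead of other lines and glue two lines together on output.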


Alex





