Need help in extracting lines from word using python

razinzamada at gmail.com razinzamada at gmail.com
Wed Mar 20 02:14:54 EDT 2013


Thanks DAVE

On Tuesday, March 19, 2013 8:24:24 PM UTC+5:30, Dave Angel wrote:
> On 03/19/2013 10:20 AM, razinzamada at gmail.com wrote:
> 
> > I'm currently trying to extract some data between 2 lines of an input file
> 
> 
> 
> Your subject line says "from word".  I'm only guessing that you might 
> 
> mean Microsoft Word, a proprietary program that does not, by default, 
> 
> save text files.  The following code and description assumes a text 
> 
> file, so there's a contradiction.
> 
> 
> 
> 
> 
> > using Python. The infile is set up such that there is a line -START- where I need the next 10 lines if and only if the -END- condition occurs before the next -START-. The -START- line can occur many times before an -END-. Here's a general example of what I mean:
> 
> >
> 
> 
> 
> In other words, you want to scan for -END-, then go backwards to -START- 
> 
> and use the first ten of the lines between?  Try coding it that way, and 
> 
> perhaps it'll be easier.
> 
> 
> 
> You also need to consider (and specify behavior for) the possibility 
> 
> that start and end are less than 10 lines apart.
> 
> 
> 
> > blah
> 
> > blah
> 
> > -START-
> 
> > 10 lines I DONT need
> 
> > blah
> 
> > -START-
> 
> > 10 lines I need
> 
> > blah
> 
> > blah
> 
> > -END-
> 
> > blah
> 
> > blah
> 
> > -START-
> 
> > 10 lines I dont need
> 
> > blah
> 
> > -START-
> 
> >
> 
> > .... and so on and so forth
> 
> >
> 
> > So far I have only been able to get the -START- + 10 lines for every iteration, but I am at a total loss when it comes to specifying the condition to only write if the -END- condition comes before another -START- condition. I'm a bit of a newb, so any help will be greatly appreciated.
> 
> >
> 
> >
> 
> > Here's the code I have for printing the -START- + 10 lines:
> 
> >
> 
> >      infile = open('input.log')   # 'in' is a reserved word, so use another name
> 
> >      out = open('output.txt', 'a')
> 
> >
> 
> >      lines = infile.readlines()
> 
> >      for i, line in enumerate(lines):
> 
> >              if (line.find('START')) > -1:
> 
> >                  out.write(line)
> 
> >                  out.write(lines[i + 1])
> 
> >                  out.write(lines[i + 2])
> 
> >                  out.write(lines[i + 3])
> 
> >                  out.write(lines[i + 4])
> 
> >                  out.write(lines[i + 5])
> 
> >                  out.write(lines[i + 6])
> 
> >                  out.write(lines[i + 7])
> 
> >                  out.write(lines[i + 8])
> 
> >                  out.write(lines[i + 9])
> 
> >                  out.write(lines[i + 10])
> 
> 
> 
>      or just        out.writelines(lines[i:i+11])     to write out all 11 of them.
> 
> >
> 
> 
> 
> 
> 
> -- 
> 
> DaveA
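
For reference, the idea Dave describes (only act once an -END- is seen, and use the nearest -START- before it) could be sketched roughly as below. This is only one possible reading of the problem, not tested against the real input.log. The function name extract_blocks and its parameters are made up for illustration. It assumes the -START- line itself is not wanted, and that when fewer than 10 lines sit between -START- and -END- only those lines are kept.

```python
def extract_blocks(lines, start_marker="-START-", end_marker="-END-", count=10):
    """For each END, collect up to `count` lines after the nearest preceding START."""
    blocks = []
    last_start = None  # index of the most recent START not yet closed by an END
    for i, line in enumerate(lines):
        if start_marker in line:
            # A newer START replaces any earlier one that never reached an END.
            last_start = i
        elif end_marker in line and last_start is not None:
            # Stop at the END line so short blocks do not run past it.
            stop = min(last_start + 1 + count, i)
            blocks.append(lines[last_start + 1:stop])
            last_start = None
    return blocks


# Small made-up sample in the same shape as the example in the question.
sample = [
    "blah\n",
    "-START-\n",
    "unwanted 1\n",
    "-START-\n",
    "wanted 1\n",
    "wanted 2\n",
    "-END-\n",
    "blah\n",
    "-START-\n",
    "unwanted 2\n",
]

for block in extract_blocks(sample):
    print("".join(block), end="")
```

With a real file, lines would come from readlines() as in the original code, and each block could be written with out.writelines(block). If the -START- line should be kept as well, the slice would begin at last_start instead of last_start + 1.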



