next line (data parsing)

robleachza at gmail.com robleachza at gmail.com
Thu Jan 17 11:35:24 EST 2008


I'm very appreciative of the comments posted. Thanks to each of you.
All good stuff.
Cheers,
-Rob


On Jan 16, 9:50 pm, George Sakkis <george.sak... at gmail.com> wrote:
> On Jan 17, 12:42 am, George Sakkis <george.sak... at gmail.com> wrote:
>
>
>
> > On Jan 17, 12:01 am, Scott David Daniels <Scott.Dani... at Acm.Org>
> > wrote:
>
> > > robleac... at gmail.com wrote:
> > > > Hi there,
> > > > I'm struggling to find a sensible way to process a large chunk of
> > > > data--line by line, but also having the ability to move to subsequent
> > > > 'next' lines within a for loop. I was hoping someone would be willing
> > > > to share some insights to help point me in the right direction. This
> > > > is not a file, so any file modules or methods available for file
> > > > parsing wouldn't apply.
>
> > > > I can iterate over each line by setting a for loop on the data object;
> > > > no problem. But basically my intention is to locate the line "Schedule
> > > > HOST" and progressively move on to the 'next' line, parsing out the
> > > > pieces I care about, until I hit "Total"; then I resume at the
> > > > start of the for loop, which locates the next "Schedule HOST".
>
> > > if you can do:
>
> > >      for line in whatever:
> > >          ...
>
> > > then you can do:
>
> > >      source = iter(whatever)
> > >      for intro in source:
> > >          if intro.startswith('Schedule '):
> > >              for line in source:
> > >                  if line.startswith('Total'):
> > >                      break
> > >                  process(intro, line)
>
> > > --Scott David Daniels
> > > Scott.Dani... at Acm.Org
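
For anyone reading this in the archive, here is a minimal runnable sketch
of the shared-iterator pattern quoted above. The function name
parse_schedules and the sample data are made up for illustration; the
real data and the process() step were never shown in the thread, so this
version just collects the lines of each block.

```python
def parse_schedules(lines):
    """Collect the lines between each 'Schedule HOST' line and the next 'Total' line."""
    source = iter(lines)              # one iterator shared by both loops
    blocks = []
    for intro in source:
        if intro.startswith('Schedule HOST'):
            body = []
            for line in source:       # resumes right after the intro line
                if line.lstrip().startswith('Total'):
                    break             # hand control back to the outer loop
                body.append(line)
            blocks.append((intro, body))
    return blocks


# Hypothetical sample input, not the original poster's data.
data = """\
header noise
Schedule HOST alpha
job1 10
job2 20
Total 30
more noise
Schedule HOST beta
job3 5
Total 5
""".splitlines()

blocks = parse_schedules(data)
```

The key point is that both for loops pull from the same iterator object,
so the inner loop consumes lines and the outer loop picks up after the
"Total" line instead of starting over.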
>
> > Or if you use this pattern often, you may extract it to a general
> > grouping function such as
> > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/521877:
>
> Sorry, google groups fscked up with the auto linewrapping (is there a
> way to increase the line length?); here it is again:
>
> import re
>
> for line in iterblocks(source,
>         start = lambda line: line.startswith('Schedule HOST'),
>         end = lambda line: re.search(r'^\s*Total',line),
>         skip_delim = False):
>     process(line)
>
> George
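
The general grouping idea can be sketched roughly as below. This is NOT
the code from the cookbook recipe, which I have not reproduced here; the
name iter_blocks_sketch and its exact behaviour (yielding one list of
lines per block, dropping an unterminated final block) are my own
assumptions, chosen only to mirror the start/end/skip_delim keywords
used in the quoted call.

```python
import re


def iter_blocks_sketch(lines, start, end, skip_delim=True):
    """Yield lists of lines found between a start(line) and an end(line) match.

    Hypothetical helper for illustration only; not the cookbook recipe.
    """
    block = None                      # None means "currently outside a block"
    for line in lines:
        if block is None:
            if start(line):
                block = [] if skip_delim else [line]
        elif end(line):
            if not skip_delim:
                block.append(line)
            yield block
            block = None
        else:
            block.append(line)
    # A block that never reaches an end line is silently dropped.


# Hypothetical sample input.
sample = [
    'noise',
    'Schedule HOST alpha',
    'job1 10',
    '  Total 10',
    'Schedule HOST beta',
    'job2 5',
    'Total 5',
]

found = list(iter_blocks_sketch(
    sample,
    start=lambda line: line.startswith('Schedule HOST'),
    end=lambda line: re.search(r'^\s*Total', line),
    skip_delim=False,
))
```

Because it is a generator, it works on any iterable of lines and never
needs the whole input in memory at once.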


