iterating over multi-line string

Sun Sep 11 17:30:30 EDT 2016

On 9/11/2016 11:34 AM, Doug OLeary wrote:
> Hey;
>
> I have a multi-line string that's the result of reading a file filled with 'dirty' text.  I read the file in one swoop to make data cleanup a bit easier - getting rid of extraneous tabs, spaces, newlines, etc.  That part's done.
>
> Now, I want to collect data in each section of the data.  Sections are started with a specific header and end when the next header is found.
>
> ^1\. Upgrade to the latest version of Apache HTTPD
> ^2\. Disable insecure TLS/SSL protocol support
> ^3\. Disable SSLv2, SSLv3, and TLS 1.0. The best solution is to only have TLS 1.2 enabled
> ^4\. Disable HTTP TRACE Method for Apache
> [[snip]]
>
> There's something like 60 lines of worthless text before that first header line so I thought I'd skip through them with:
>
> x=0  # Current index
> hx=1 # human readable index
> rgs = '^' + str(hx) + r'\. ' + monster['vulns'][x]
> hdr = re.compile(rgs)
> for l in data.splitlines():
>   while not hdr.match(l):
>     next(l)
>   print(l)
>
> which resulted in a TypeError stating that str is not an iterator.  More googling resulted in:

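You are iterating in two different ways: the for loop already steps 
through the lines, and next(l) is being called on a str, which is not 
an iterator.  Not surprising it does not work.  The re is also 
unnecessary.  The following seems to do what you want and is much 
simpler.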

# Raw string, so the backslashes in the sample headers stay literal.
data = r'''junk
more junk
^1\. upgrade
^2\. etc
'''
data = data[data.index(r'^1\.'):]
print(data)
# prints:
# ^1\. upgrade
# ^2\. etc
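
If the headers do have to be matched by pattern rather than found as a 
fixed string, a single pass over the lines is still enough.  What 
follows is only a minimal sketch under some assumptions: the real 
headers look like "1. Upgrade ..." (that is, the ^ and \. in the 
question are regex notation, not characters in the data), and the names 
HEADER, split_sections and sample are made up for illustration.

```python
import re
from itertools import dropwhile

# Assumption: a header is a line starting with digits, a dot and a space.
HEADER = re.compile(r'^\d+\. ')

def split_sections(text):
    """Return a list of (header, body_lines) pairs, skipping leading junk."""
    # One iterator over the lines; dropwhile discards everything before
    # the first line that matches HEADER.
    lines = dropwhile(lambda line: not HEADER.match(line), text.splitlines())
    sections = []
    for line in lines:
        if HEADER.match(line):
            sections.append((line, []))   # start a new section
        else:
            sections[-1][1].append(line)  # belongs to the current section
    return sections

# Hypothetical sample data, not taken from the real file.
sample = """\
junk
more junk
1. Upgrade to the latest version of Apache HTTPD
details a
2. Disable insecure TLS/SSL protocol support
details b
details c
"""

for header, body in split_sections(sample):
    print(header, body)
```

Because dropwhile stops discarding at the first header, the first line 
the for loop sees is always a header, so sections[-1] is never looked up 
on an empty list.  A body line that itself starts with a number, a dot 
and a space would be mistaken for a header; the pattern would need 
tightening if the real data contains such lines.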

-- 
Terry Jan Reedy