Extract all words between two keywords in .txt file (Python)

Ben Bacarisse ben.usenet at bsb.me.uk
Thu Dec 12 07:40:24 EST 2019


A S <aishan0403 at gmail.com> writes:

> On Thursday, 12 December 2019 02:28:09 UTC+8, Ben Bacarisse  wrote:
>> A S <aishan0403 at gmail.com> writes:
>> 
>> > I would like to extract all words within specific keywords in a .txt
>> > file. For the keywords, there is a starting keyword of "PROC SQL;" (I
>> > need this to be case insensitive) and the ending keyword could be
>> > either "RUN;", "quit;" or "QUIT;". This is my sample .txt file.
>> >
>> > Thus far, this is my code:
>> >
>> > with open('lan sample text file1.txt') as file:
>> >     text = file.read()
>> >     regex = re.compile(r'(PROC SQL;|proc sql;(.*?)RUN;|quit;|QUIT;)')
>> >     k = regex.findall(text)
>> >     print(k)
>> 
>> Try
>> 
>>   re.compile(r'(?si)(PROC SQL;.*(?:QUIT|RUN);)')
<cut> 
>
> Hey Ben, this works for my sample .txt file! Thanks :) But strangely
> enough, it won't work on the other text files I have to parse, which
> are similar but have some variations.

No one can help without details.  Maybe you need the non-greedy .*?
rather than the greedy .* I put in there.  Dropping the ? from your
pattern was not a deliberate change on my part.
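
To show the difference, here is a minimal sketch on made-up input.  The
sample text and the names greedy and lazy are my own assumptions, not
taken from your files.  With more than one PROC SQL block in a file, the
greedy .* runs from the first "PROC SQL;" to the last "QUIT;" or "RUN;",
while the non-greedy .*? stops at the nearest one:

```python
import re

# Hypothetical sample input, invented for illustration only.
text = """\
proc sql;
  select * from a;
quit;
data x; set y; run;
PROC SQL;
  select * from b;
QUIT;
"""

# (?s) lets . match newlines; (?i) makes the match case insensitive.
greedy = re.compile(r'(?si)(PROC SQL;.*(?:QUIT|RUN);)')
lazy = re.compile(r'(?si)(PROC SQL;.*?(?:QUIT|RUN);)')

# Greedy: one match spanning everything from the first block to the last.
print(len(greedy.findall(text)))  # 1

# Non-greedy: one match per block; the data step in between is skipped.
for block in lazy.findall(text):
    print(repr(block))
```

If your other files still fail with the non-greedy form, the variations
themselves are what we would need to see.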

-- 
Ben.

