[Tutor] how to print lines which contain matching words or strings

Asad asad.hasan2004 at gmail.com
Mon Nov 19 22:15:08 EST 2018


Hi Avi Gross /All,

Thanks for the reply. Yes, you are correct: I would like to open a file,
process it one line at a time, select just the lines that meet my criteria,
and print them while ignoring the rest. I have created the following code:


   import re

   # Read the whole file into a list of lines; "with" closes the file.
   with open(r'file1.txt', 'r') as f3:
       f = f3.readlines()

   for linenum in range(len(f)):
       if re.search(r"ERR-1", f[linenum]):
           print(f[linenum])
           continue   # line printed once; move on to the next line
       # search for a patch number, six digits long, for example 123456
       if re.search(r"\d\d\d\d\d\d", f[linenum]):
           print(f[linenum])
           continue
       if re.search(r"Good Morning", f[linenum]):
           print(f[linenum])
           continue
       if re.search(r"Breakfast", f[linenum]):
           print(f[linenum])
           continue
       ...
       # I have 5 more heterogeneous if conditions

=======================================================================
This is a beginner's approach to printing the lines that match the if
conditions.

How can I make it better? Maybe create a dictionary or a list of search
items and then iterate over the lines in the file, printing the lines that
match a condition.
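
One possible shape for the dictionary idea, as a sketch only. The labels and
the names CRITERIA, first_match and matching_lines are made up for
illustration; the patterns are the four from the code above, and the sample
lines are invented test data rather than real file contents.

```python
import re

# Hypothetical labels mapped to compiled patterns (patterns from the code above).
CRITERIA = {
    "error": re.compile(r"ERR-1"),
    "patch number": re.compile(r"\d{6}"),      # six consecutive digits
    "greeting": re.compile(r"Good Morning"),
    "meal": re.compile(r"Breakfast"),
}


def first_match(line, criteria=CRITERIA):
    """Return the label of the first criterion found in line, or None."""
    for label, pattern in criteria.items():
        if pattern.search(line):
            return label
    return None


def matching_lines(lines, criteria=CRITERIA):
    """Yield (label, line) for each line that meets at least one criterion."""
    for line in lines:
        label = first_match(line, criteria)
        if label is not None:
            yield label, line.rstrip("\n")


# Invented sample data standing in for the lines of a file.
sample = [
    "ERR-1 disk full\n",
    "nothing here\n",
    "applied patch 123456\n",
    "Good Morning all\n",
]
selected = list(matching_lines(sample))
for label, line in selected:
    print(label + ": " + line)
```

An open file object can be passed in place of the sample list, since
iterating over a file yields one line at a time. Each line is reported once,
under the first criterion it meets.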


Please advise,

Thanks,

Previous email :
======================================================================

Asad,

As others have already pointed out, your request is far from clear.

Ignoring the strange use of words, and trying to get the gist of the
request, would this be close to what you wanted to say?

You have a file you want to open and process a line at a time. You want to
select just lines that meet your criteria and print them while ignoring the
rest.

So what are the criteria? It sounds like you have a list of criteria that
might be called patterns. Your example shows a heterogeneous collection:

[A ,"B is good" ,123456 , "C "]

A is either an error or the name of a variable that contains something. We
might want a hint as searching for any old object makes no sense.

The second and fourth are exact strings. No special regular expression
pattern. Searching for them is trivial using normal string functionality.
Assuming they can be anywhere in a line:

>>> line1 = "Vitamin B is good for you and so is vitamin C"
>>> line2 = "Currently nonsensical."
>>> line3 = ""
>>> "B is good" in line1
True
>>> "B is good" in line2
False
>>> "B is good" in line3
False
>>> "C" in line1
True
>>> "C" in line2
True
>>> "C" in line3
False

To test everything in a list, you need code like this for each line:

for whatever in [A ,"B is good" ,123456 , "C "]:
    if whatever in line: print(line)

Actually, the above could print multiple copies so you should break out
after any one matches.

123456 is a challenge to match. You could search for str(whatever) perhaps.
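
A rough sketch of how those two suggestions could fit together: any() stops
at the first item that matches, so a line is selected at most once, and
str() lets the number 123456 be searched for as text. The name line_matches
and the third sample line are made up for illustration, and A is left out
because nothing in the thread says what it holds.

```python
def line_matches(line, items):
    """True if the text form of at least one item occurs anywhere in line."""
    # any() short-circuits, which has the same effect as breaking out
    # of a loop after the first item that matches.
    return any(str(item) in line for item in items)


items = ["B is good", 123456, "C "]   # A omitted: its value is unknown

lines = [
    "Vitamin B is good for you and so is vitamin C",
    "Currently nonsensical.",
    "patch 123456 applied",           # invented sample line
    "",
]

selected = [line for line in lines if line_matches(line, items)]
for line in selected:
    print(line)
```

Note that "C " with its trailing space does not match "Currently", which is
one reason to decide whether the space is really part of the criterion.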

Enough. First explain what you really want.

If you want to do a more general search using regular expressions, then the
things to search for would all be strings in RE format. You could search
multiple times or use the OR operator carefully inside one regular
expression. You have not stated any need to tell what was matched or where
it is in the line, so that would be yet another story.
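
For illustration only, here is one way the single-expression route might
look, including the "what matched and where" part. The literal strings and
the six-digit pattern are taken from the code earlier in this thread; the
names LITERALS, PATTERN and find_first are hypothetical. re.escape() is used
so that literal text is not mistaken for regex syntax.

```python
import re

# Literal strings are escaped; the six-digit pattern is already a regex.
LITERALS = ["ERR-1", "Good Morning", "Breakfast"]
PATTERN = re.compile("|".join([re.escape(s) for s in LITERALS] + [r"\d{6}"]))


def find_first(line):
    """Return (matched_text, start_index) for the leftmost match, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return match.group(0), match.start()


print(find_first("patch 123456 applied"))
print(find_first("Good Morning, ERR-1"))
print(find_first("nothing to see"))
```

search() reports the leftmost match in the line, so when several
alternatives occur, only the earliest one is returned; re.finditer() would
be the call to reach for if every match is wanted.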

-----Original Message-----
From: Tutor <tutor-bounces+avigross=verizon.net at python.org> On Behalf Of
Asad
Sent: Sunday, November 18, 2018 10:19 AM
To: tutor at python.org
Subject: [Tutor] how to print lines which contain matching words or strings

Hi All ,

       I have a set of words and strings :

like :

p = [A ,"B is good" ,123456 , "C "]

I have a file from which I need to print only the lines that match a
pattern in p

thanks,


On Tue, Nov 20, 2018 at 6:12 AM <tutor-request at python.org> wrote:

> Today's Topics:
>
>    1. Re: seeking beginners tutorial for async (Mats Wichmann)
>    2. Re: seeking beginners tutorial for async (Bob Gailer)
>    3. Re: how to print lines which contain matching words or
>       strings (Avi Gross)
>    4. [Python 3] Threads status, join() and Semaphore queue
>       (Dimitar Ivanov)
>
>
>
> ---------- Forwarded message ----------
> From: Mats Wichmann <mats at wichmann.us>
> To: tutor at python.org
> Cc:
> Bcc:
> Date: Mon, 19 Nov 2018 10:05:35 -0700
> Subject: Re: [Tutor] seeking beginners tutorial for async
> On 11/18/18 4:50 PM, bob gailer wrote:
> > I have yet to find a tutorial that helps me understand and apply async!
> >
> > The ones I have found are either incomplete, or they wrap some other
> > service, or they are immediately so complex that I have no hope of
> > understanding them.
> >
> > I did find a useful javascript tutorial at
> > https://javascript.info/promise-basics, but trying to map it to python
> > is very frustrating.
> >
> > The python docs also do not help.
>
> Can you be more specific what you're looking for?
>
> There are a lot of aspects to async programming in Python, and there's
> been a lot of language-support changes in recent versions.  This just
> means there will be a lot of differing efforts to explain things based
> on when the tutorial was written.  Plus async, while the way we work as
> humans, is different to the way we tend to think of programming (yes,
> I'm speaking for myself here) unless you're a GUI programmer, where
> event loops are old hat.  "Wraps some other service" - several tutorials
> I've glanced at do a simple webserver, because that's an example where
> we understand why synchronous doesn't work, and the basic concepts are
> pretty simple (good webservers are _just_ a little harder, of course).
>
>
>
>
>
> ---------- Forwarded message ----------
> From: Bob Gailer <bgailer at gmail.com>
> To: Mats Wichmann <mats at wichmann.us>
> Cc: tutor at python.org
> Bcc:
> Date: Mon, 19 Nov 2018 16:13:58 -0500
> Subject: Re: [Tutor] seeking beginners tutorial for async
> > Can you be more specific what you're looking for?
>
> For starters, a minimal executable program that uses the async keyword.
>
> On the JavaScript side this is trivial and easily understood.
>
> I did find in the Python documentation a hello world program that uses
> async IO. It helped me understand how to build an event loop, start soon,
> start later, stop the loop, run forever, and run until complete. That was
> very helpful. But it did not introduce async.
>
> I'd like to see the trivial program built up step by step, adding one new
> feature at a time so that I can understand exactly what that feature does.
>
> I am talking about python 3.6 and 3.7.
>
> Thank you for asking for the clarification, I hope this helps.
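
As a hedged illustration of what such a minimal program might look like, the
sketch below defines two coroutines with async def and runs them
concurrently. The names greet and main are invented for the example. It
assumes Python 3.7 for asyncio.run(); on 3.6 the last step would instead go
through asyncio.get_event_loop().run_until_complete(main()).

```python
import asyncio


async def greet(name, delay):
    """Coroutine: pause without blocking the event loop, then return text."""
    await asyncio.sleep(delay)
    return "hello " + name


async def main():
    # gather() runs both coroutines concurrently and returns their results
    # in the order the coroutines were passed in, not in completion order.
    return await asyncio.gather(greet("a", 0.02), greet("b", 0.01))


results = asyncio.run(main())   # Python 3.7+; creates and closes the loop
print(results)
```

The second coroutine finishes first because its delay is shorter, yet its
result still comes second in the list.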
>
>
>
>
> ---------- Forwarded message ----------
> From: Avi Gross <avigross at verizon.net>
> To: <tutor at python.org>
> Cc:
> Bcc:
> Date: Sun, 18 Nov 2018 20:13:41 -0500
> Subject: Re: [Tutor] how to print lines which contain matching words or
> strings
> _______________________________________________
> Tutor maillist  -  Tutor at python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor
>
>
>
>
>
> ---------- Forwarded message ----------
> From: Dimitar Ivanov <dimitarxivanov at gmail.com>
> To: tutor at python.org
> Cc:
> Bcc:
> Date: Mon, 19 Nov 2018 23:52:49 +0000
> Subject: [Tutor] [Python 3] Threads status, join() and Semaphore queue
> Hello everyone,
>
> I'm having a hard time getting my head around threads so I was hoping
> someone who has better understanding of their underlying functionality
> could lend me a helping hand, in particular how threads work with each
> other when using thread.join() and Semaphore set with maximum value. I'll
> try to keep it as clear and concise as possible, but please don't hesitate
> to ask if anything about my approach is unclear or, frankly, awful.
>
> I'm writing a script that performs a couple of I/O operations and CLI
> commands for each element in a list of IDs. The whole process takes a while
> and may vary based on the ID, so threading sounded like the best fit, since
> the next ID can start once space has freed up. I'm pasting an extract of my
> code below and will explain underneath what I can't properly understand.
>
> Note: Please ignore any syntax typos, I'm rewriting the code to make it
> suitable for here.
>
>
> file1.py
> ---------
> import threading
> import file2
>
> ids = [<IDs listed here>]
> threadsPool = []
> for id in ids:
>   thread = threading.Thread(target=file2.runStuff, name=str(id), args=(id,))
>   threadsPool.append(thread)
> for thread in threadsPool:
>   thread.start()
> for thread in threadsPool:
>   # enumerate() is a function of the threading module, not a Thread method
>   print(threading.enumerate())
>   print("Queuing thread" + str(thread))
>   thread.join()
>
> file2.py
> ----------
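> import threading
> import file3
> import file4
>
> queue = threading.Semaphore(2)
> def runStuff(id):
>   # "with" releases the semaphore even if the work raises an exception
>   with queue:
>     print("Lock acquired for " + str(id))
>     file3.doMoreStuff()
>     file4.evenMoreStuff()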
>
>
> Onto my confusion - as long as I don't try to print information about the
> thread that's being queued or the total amount of threads using
> .enumerate(), the script is working absolutely flawlessly, each thread that
> doesn't have a lock is waiting until it acquires it and then moves on. I
> decided it'd be nice to be able to provide more information about which
> thread starts next and how many threads are active right now (each can take
> a different amount of time), however, when I tried to do that, my log was
> showing me some pretty funky output which at first made me believe I've
> messed up all my threads, example:
>
>
> <<  2018-11-19 15:01:38,094 file2 [ID09] INFO - Lock acquired for
> ID09                 <---- this is from file2.py
> ------ some time later and other logs in here ---------
> [<_MainThread(MainThread, started 140431033562880)>, <Thread(ID09, started
> 140430614177536)>] <---- output from thread.enumerate(), file1.py
> <<  2018-11-19 15:01:38,103 file1 [MainThread] DEBUG - Queuing thread -
> <Thread(ID09, started 140430614177536)> <---- output from print() right
> after thread.enumerate()
>
>
> After some head scratching, I believe I've finally tracked down the reason
> for my confusion:
>
> The .start() loop starts the threads, and the first 2 acquire a lock
> immediately and start running; the rest wait for a lock, which is fine.
> What I didn't realize is that the .join() loop also goes through the
> threads that the .start() loop has already kicked off (the first 2, since
> the Semaphore allows 2 locks). My print in that loop then tells me those
> threads are being queued, when in fact they are already running. The text
> is simply misleading, as seen below:
>
> <<  2018-11-19 15:01:33,094 file1.py [MainThread] DEBUG - Queuing thread -
> <Thread(ID02, stopped 140430666626816)> <--- makes it clear the thread has
> already even finished
>
> Which finally gets me to my cry for help - I know I can't modify the
> threadsPool list to remove the threads already created on the fly, so I can
> have only the ones pending to be queued in the 2nd loop, but for the life
> of me I can't think of a proper way to try and extract some information
> about what threads are still going (or rather, have finished since
> thread.enumerate() shows both running and queued).
>
> I have the feeling I'm using a very wrong approach in trying to extract
> that information in the .join() loop, since it only goes back to it once a
> thread has finished, but at the same time it feels like the perfect timing.
> I feel like (and I might be very wrong) if I could only have the threads
> that are actually being queued in there (getting rid of the ones started
> initially), my print(thread) will be the absolute sufficient amount of
> information I want to display.
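
One way this could be approached, offered as a sketch under assumptions
rather than a fix for the real script: let each worker record when it holds
a semaphore slot, and combine that with Thread.is_alive() to split threads
into running, waiting and finished. The names slots, state_lock, running,
run_stuff and snapshot are invented here, and time.sleep() stands in for the
real work. A snapshot is only approximate, because threads keep changing
state while it is being taken.

```python
import threading
import time

slots = threading.Semaphore(2)      # at most two workers run at once
state_lock = threading.Lock()       # protects the running set
running = set()                     # names of threads holding a slot


def run_stuff(work_id, duration):
    with slots:
        name = threading.current_thread().name
        with state_lock:
            running.add(name)
        try:
            time.sleep(duration)    # stand-in for the real work
        finally:
            with state_lock:
                running.discard(name)


def snapshot(threads):
    """Return (running, waiting, finished) name lists for started threads."""
    with state_lock:
        now_running = sorted(running)
    waiting = sorted(t.name for t in threads
                     if t.is_alive() and t.name not in now_running)
    finished = sorted(t.name for t in threads if not t.is_alive())
    return now_running, waiting, finished


threads = [threading.Thread(target=run_stuff, name=f"ID{n:02d}", args=(n, 0.05))
           for n in range(1, 5)]
for t in threads:
    t.start()
print(snapshot(threads))            # varies with timing
for t in threads:
    t.join()
final = snapshot(threads)
print(final)
```

Calling snapshot() before each join() would give the kind of status line
described above without modifying the list of threads.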
>
> And just in case you are wondering why I have my threads starting in
> file1.py and my Semaphore queue in file2.py, it's because I wanted to split
> the runStuff(id) function in a separate module due to its length. I don't
> know if it's a good way to do it, but thankfully the Python interpreter is
> smart enough to see through my ignorance.
>
> I'm also really sorry for the wall of text, I just hope the information
> provided is enough to clear up the situation I'm in and what I'm struggling
> with.
>
> Thank you in advance and with kindest regards,
> Dimitar
>


-- 
Asad Hasan
+91 9582111698

