Parallel(?) programming with python

Wed Aug 10 14:54:36 EDT 2022

There are many possible discussions we can have here and some are not really
about whether and how to use Python.

The user asked how to do something that is a fairly standard task for some
people, and arguably one that is not best done by a single application
running things in parallel.

So, yes, if you have full access to your machine and can schedule tasks,
some obvious answers come to mind: one process listens, receives data and
stores it; another process periodically wakes up, grabs recent data and
processes it; and perhaps yet another process comes up even less often and
does some re-arrangement of old data.

And, yes, for such large volumes of data it may be a poor design to hold all
the data in memory for many hours or even days. Using a database, or
files/folders with a naming structure, is a good idea.

But the original question remains, in my opinion, a reasonable one. All
kinds of applications can be written as sets of tasks that run largely in
parallel and communicate through shared data structures such as queues,
perhaps guarded by locks. Any task that takes nontrivial time then needs a
way to buffer its communications so that it does not block the others.
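
As a rough illustration of that idea, here is a minimal sketch using only the
standard library's threading and queue modules. The names listener,
processor and run_demo are made up for this example, the "data" is just a
list of numbers, and summing them stands in for whatever real processing is
needed. It is one possible arrangement, not the only way to do it.

```python
import queue
import threading


def listener(out_q, samples):
    """Stand-in for the receiving task: hand over each sample as it arrives."""
    for sample in samples:
        out_q.put(sample)   # unbounded queue, so put() never blocks the listener
    out_q.put(None)         # sentinel meaning "no more data"


def processor(in_q, results):
    """Consume samples at its own pace; the queue buffers in between."""
    batch = []
    while True:
        item = in_q.get()   # blocks until something is available
        if item is None:
            break
        batch.append(item)
    results.append(sum(batch))


def run_demo(samples):
    """Run one listener and one processor thread and return the processed sum."""
    q = queue.Queue()
    results = []
    threads = [
        threading.Thread(target=listener, args=(q, samples)),
        threading.Thread(target=processor, args=(q, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results[0]
```

Because queue.Queue does its own locking, neither task touches the other's
data directly, and a slow processor only makes the queue grow rather than
stalling the listener.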

Also, for people who want to start ONE process and let it run, and who
perhaps cannot easily schedule other processes at the system level, it can
be advantageous to know how to set up something along those lines within a
single Python session.

Of course, for efficiency reasons, any I/O to files slows things down, but
the situation described here seems easier and safer to handle in many other
ways. I think a main point is that there are good ways to keep the data
from being acted on by two parties that share memory. One is NOT to share
memory for this purpose. Another might be to have the 6-hour process use a
lock to move the data aside, or send a message to the receiving process
asking it to pause a moment, set the data aside, and begin collecting anew
while the old data is processed.
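
The lock-based variant might look something like the sketch below. The
Collector class and its add and take_all methods are hypothetical names
invented for illustration, and a plain list stands in for the real data
store. The point is only that the swap happens while the lock is held, so
the slow processing can run afterwards without holding up the receiver.

```python
import threading


class Collector:
    """Accumulates items; take_all() swaps in a fresh list under a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = []

    def add(self, item):
        # Called by the receiving side for every new item.
        with self._lock:
            self._data.append(item)

    def take_all(self):
        # Called by the periodic (e.g. 6-hour) side: move the old data
        # aside and let collection start over with an empty list.
        with self._lock:
            old, self._data = self._data, []
        return old  # processed outside the lock
```

The lock is held only for the append or the swap, both of which are quick,
so neither side waits long on the other.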

There are many such choices, and the parts need not be in the same process
or all written in Python. But some solutions generalize more easily than
others. For example, might a need arise to collect data from multiple
sources, perhaps using multiple listeners?

-----Original Message-----
From: Python-list <python-list-bounces+avi.e.gross=gmail.com at python.org> On
Behalf Of Dieter Maurer
Sent: Wednesday, August 10, 2022 1:33 PM
To: Schachner, Joseph (US) <Joseph.Schachner at Teledyne.com>
Cc: Andreas Croci <andrea.croci at gmx.de>; python-list at python.org
Subject: RE: Parallel(?) programming with python

Schachner, Joseph (US) wrote at 2022-8-9 17:04 +0000:
>Why would this application *require* parallel programming? This could be
>done in one, single thread program. Call time to get time and save it as
>start_time. Keep a count of the number of 6 hour intervals, initialize it
>to 0.

You could also use the `sched` module from Python's library.
-- 
https://mail.python.org/mailman/listinfo/python-list