Parallel(?) programming with python

Cameron Simpson cs at cskk.id.au
Mon Aug 8 22:30:53 EDT 2022


On 09Aug2022 00:22, Oscar Benjamin <oscar.j.benjamin at gmail.com> wrote:
>On Mon, 8 Aug 2022 at 19:01, Andreas Croci <andrea.croci at gmx.de> wrote:
>> Basically the question boils down to whether it is possible to have
>> parts of a program (could be functions) that keep doing their job
>> while other parts do something else on the same data, and what is the
>> best way to do this.

Which is of course feasible, as others have outlined.

>Why do these "parts of a program" need to be part of the *same*
>program? I would write this as just two separate programs. One
>collects the data and writes it to a file. The other periodically
>reads the file and computes the DFT.

I would also write these as separate programmes, or at least as distinct
modes of the same programme (eg "myprog poll" and "myprog archive" etc).
Largely because you might run the "poll" mode regularly and briefly, and
the processing phase separately and less frequently. You don't need to
keep a single programme lurking around forever - fire it up as required.
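
As a rough sketch of the "distinct modes" idea, assuming argparse
subcommands are an acceptable way to do it: the "poll" and "archive" mode
names come from the example above, while the function bodies are
placeholders I have made up purely for illustration.

```python
import argparse


def poll(args):
    # Placeholder (hypothetical): collect a batch of data and write it out.
    return "poll"


def archive(args):
    # Placeholder (hypothetical): process/archive the completed data files.
    return "archive"


def build_parser():
    parser = argparse.ArgumentParser(prog="myprog")
    # One subcommand per mode; exactly one must be named on the command line.
    modes = parser.add_subparsers(dest="mode", required=True)
    modes.add_parser("poll", help="collect data briefly").set_defaults(func=poll)
    modes.add_parser("archive", help="process completed files").set_defaults(func=archive)
    return parser


def main(argv=None):
    # argv=None makes argparse read sys.argv[1:], as a real script would.
    args = build_parser().parse_args(argv)
    return args.func(args)
```

A real script would call main() under the usual `if __name__ == "__main__":`
guard, so that cron (or whatever fires it up) can run "myprog poll" often
and "myprog archive" rarely.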

However, I want to point out that this _in no way_ removes the need for
access control and mutexes. It will change the mechanism (because your
two programmes are now operating separately) and make it more concrete
in your mind what _actually and precisely_ needs protection.

For example, you probably want to avoid _processing_ a data file at the 
same time as _updating_ that file. Depending on what you're doing this 
can be as simple as keeping "to be updated" files with distinct names 
from "available to be processed/archived" files. This is a standard 
difficulty with "hot folder" upload areas.

A common approach might be to write a file with a "temp" style name (eg 
".tmp*") until completed, then rename it to its official name (eg 
"datafile*"). And then your processing/archiving side can simply ignore 
the "in progress" files because they do not match the names it cares 
about.
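
A minimal sketch of that write-then-rename approach, assuming both names
live in the same directory (so the rename stays on one filesystem). The
function names publish() and ready_files() are hypothetical, and the
".tmp"/"datafile" naming is just the example naming from above.

```python
import os
import tempfile
from pathlib import Path


def publish(data_dir, name, payload):
    """Write payload under a ".tmp*" name, then rename it to its official name."""
    data_dir = Path(data_dir)
    final_path = data_dir / name
    # Create the temp file in the target directory, not in /tmp,
    # so that the rename below does not cross filesystems.
    fd, tmp_name = tempfile.mkstemp(prefix=".tmp", dir=data_dir)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
        # The official name only appears once the file is complete.
        os.replace(tmp_name, final_path)
    except BaseException:
        # Do not leave a partial temp file behind on failure.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
    return final_path


def ready_files(data_dir):
    """List only completed files; in-progress ".tmp*" names never match."""
    return sorted(Path(data_dir).glob("datafile*"))
```

The updating side calls something like publish(datadir, "datafile001", data)
and the processing side only ever looks at ready_files(datadir), so it
never sees a half written file.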

Anyway, those are specifics, which will be driven by what you're
actually doing. The point is that you still need to coordinate use of
the files suitably for your needs. Whether you do this in one long
running programme with Threads/mutexes or in separate programmes sharing
a data directory just changes the mechanisms.
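
For comparison, here is a sketch of the single programme variant, assuming
a collector Thread and a threading.Lock guarding a shared list. The
SharedSamples class and collect() function are made up for illustration;
the shape of the coordination is the point, not the names.

```python
import threading


class SharedSamples:
    """A list of samples guarded by a mutex."""

    def __init__(self):
        self._lock = threading.Lock()
        self._samples = []

    def append(self, value):
        # Only one thread at a time may modify the list.
        with self._lock:
            self._samples.append(value)

    def snapshot(self):
        # Copy under the lock; the caller then works on the copy
        # without holding up the collector.
        with self._lock:
            return list(self._samples)


def collect(shared, values):
    # Stand-in for the polling side: append each incoming value.
    for value in values:
        shared.append(value)


shared = SharedSamples()
worker = threading.Thread(target=collect, args=(shared, range(1000)))
worker.start()
partial = shared.snapshot()  # safe to take while the collector is running
worker.join()
```

The snapshot plays the same role as the renamed "datafile*" above: the
processing side only ever works on data which is not being updated.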

Cheers,
Cameron Simpson <cs at cskk.id.au>

