[Tutor] Running Python Scripts at same time

John Weller john at johnweller.co.uk
Sun Jun 28 05:59:16 EDT 2020


Thank you to all who responded.  I was conscious of the file locking issue and had a strategy in mind to cope.  My concern was whether Python could manage.  I am an experienced programmer new to Python and had visions of the early interpreted languages I used, such as those on the BBC Micro and early PCs 😊

Thanks again

John

John Weller
01380 723235
07976 393631

-----Original Message-----
From: Cameron Simpson <cs at cskk.id.au> 
Sent: 28 June 2020 00:06
To: John Weller <john at johnweller.co.uk>
Cc: 'Python Tutor' <tutor at python.org>
Subject: Re: [Tutor] Running Python Scripts at same time

On 26Jun2020 11:21, John Weller <john at johnweller.co.uk> wrote:
>I have a Python program which will be running 24/7 (I hope 😊).  It is 
>generating data in a file which I want to clean up overnight.  The way 
>I am looking at doing it is to run a separate program as a Cron job at 
>midnight – will that work?  The alternative is to add it to the loop 
>and check for the time. I have tried researching this but only got even 
>more confused.

Running a separate program is perfectly reasonable.

And crontab is a perfect place for a regular task like this.

The primary issue usually is that you do not want both programmes to be using the file at the same time.

Suppose the file were, say, a CSV file to which your long-running programme (A) appended data, and that the clean up programme (B) reads the CSV file, tidies some stuff, and rewrites the CSV file. You can imagine this sequence:

    - programme B opens the file and reads the data
    - programme B thinks about the data to clean it
    - programme A appends more data to the file
    - programme B rewrites the clean data into the file,
      _overwriting_ the new data programme A just appended
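
A minimal sketch of that sequence, replayed in a single process so the outcome is repeatable. The file name "foo.csv" and the row contents are made up for illustration; the real programmes would of course be two separate processes.

```python
import os
import tempfile

# Hypothetical demo: replay the four steps above, in order, on a throwaway file.
with tempfile.TemporaryDirectory() as tmpdir:
    datafilepath = os.path.join(tmpdir, "foo.csv")
    with open(datafilepath, "w") as f:
        f.write("1,old\n2,old\n")

    # programme B opens the file and reads the data
    with open(datafilepath) as f:
        rows_seen_by_b = f.readlines()

    # (programme B thinks about the data to clean it)

    # programme A appends more data to the file
    with open(datafilepath, "a") as f:
        f.write("3,new\n")

    # programme B rewrites the clean data into the file
    with open(datafilepath, "w") as f:
        f.writelines(rows_seen_by_b)

    with open(datafilepath) as f:
        final_rows = f.readlines()

print(final_rows)
```

The row that programme A appended is missing from final_rows, because programme B rewrote the file from the copy it read earlier.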

The usual process with a shared external file is to use a lock facility.  
These come in a few forms, and it is essential that both programme A and programme B use the same locking system.

One of the easiest and most portable is to make a lock file while you work with the file. If your data file is called "foo" you might use a lock file called "foo.lock".

On a UNIX type system (includes Linux) you can atomically make such a file like this:

    import os
    .......
    lockpath = datafilepath + '.lock'
    lockfd = os.open(lockpath, os.O_CREAT | os.O_EXCL | os.O_RDWR, 0)

That is a special mode of the OS "open" call (_not_ Python's default "open" builtin) whose parameters have the following meanings:

    - os.O_CREAT: create the file if missing
    - os.O_EXCL: ensure that the file is created - if it already exists 
      this raises an exception
    - os.O_RDWR: open the file for read and write
    - 0: the initial permissions, ensuring that the file is _not_ 
      readable or writable

See "man 2 open" on a UNIX system for the spec.

With O_EXCL the open already fails if the lock file exists. The combination of O_RDWR and 0 permissions is a second line of defence: if the file already exists (made by the "other" programme) then it won't have any permissions, which means we won't get read or write access and the open will fail. The nice thing about this is that the initial permissions are _immediate_ when the file is created by the OS - there's no tiny window where the file has read/write perms which then get removed - the OS ensures it. This is nice on networked file shares (if they are reliable).

Anyway, the upshot of the os.open() call above is that if the lockfile already exists, the open will fail, and otherwise it will succeed, preventing another programme doing the same thing.
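
A small sketch of that behaviour, assuming a UNIX type system and a throwaway directory. The name "foo.lock" and the variable names are illustrative only. The first open creates the lock file, and a second open with the same flags fails while it exists:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    lockpath = os.path.join(tmpdir, "foo.lock")
    flags = os.O_CREAT | os.O_EXCL | os.O_RDWR

    # First attempt: the file does not exist yet, so it is created.
    lockfd = os.open(lockpath, flags, 0)

    # Second attempt: the file exists, so O_EXCL makes the open fail.
    try:
        otherfd = os.open(lockpath, flags, 0)
    except FileExistsError:
        second_attempt_failed = True
    else:
        second_attempt_failed = False
        os.close(otherfd)

    # Release the lock again.
    os.close(lockfd)
    os.remove(lockpath)
    lock_removed = not os.path.exists(lockpath)

print(second_attempt_failed, lock_removed)
```

FileExistsError is a subclass of OSError, so code that catches OSError will see this failure too.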

When finished, close the lockfd and remove the lock file:

    os.close(lockfd)
    os.remove(lockpath)

Now, because the whole scenario is that occasionally both programmes want the file at the same time, the os.open _will_ fail in that case. So the idea is that you repeat it until it succeeds, then do your work:

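    import os
    import time

    while True:
        try:
            lockfd = os.open(lockpath, os.O_CREAT | os.O_EXCL | os.O_RDWR, 0)
        except OSError:
            # the lock file already exists: the other programme has it
            print("lock not obtained, sleeping")
            time.sleep(1)
        else:
            break
    try:
        .... work with the data file ...
    finally:
        # always release the lock, even if the work raised an exception
        os.close(lockfd)
        os.remove(lockpath)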

Put that logic in both programmes and you should be ok.
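
To avoid pasting the loop into both programmes, the same logic could be wrapped in a context manager. This is only a sketch under stated assumptions, and it is not the "makelockfile" function mentioned below: the name lockfile() and its poll_interval and timeout parameters are invented here for illustration.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def lockfile(datafilepath, poll_interval=1.0, timeout=None):
    """Hold datafilepath + '.lock' for the duration of a with-block.

    Hypothetical helper: poll_interval is the sleep between attempts and
    timeout (seconds, or None to wait forever) bounds the total wait.
    """
    lockpath = datafilepath + ".lock"
    start = time.monotonic()
    while True:
        try:
            lockfd = os.open(lockpath, os.O_CREAT | os.O_EXCL | os.O_RDWR, 0)
        except FileExistsError:
            # Someone else holds the lock; give up or wait and retry.
            if timeout is not None and time.monotonic() - start >= timeout:
                raise TimeoutError("could not obtain " + lockpath) from None
            time.sleep(poll_interval)
        else:
            break
    try:
        yield lockpath
    finally:
        # Release the lock even if the body of the with-block raised.
        os.close(lockfd)
        os.remove(lockpath)
```

Each programme would then do its work inside "with lockfile(datafilepath):". Note that a programme which is killed outright while holding the lock still leaves the lock file behind, so a timeout gives the other programme a chance to report the problem instead of sleeping forever.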

You can see a more elaborate version of this logic in my "makelockfile" 
function here:

    https://bitbucket.org/cameron_simpson/css/src/tip/lib/python/cs/fileutils.py#lines-527

(Atlassian are going to nuke that repo soon, alas, because they find mercurial too hard. But until then the link should be good.)

Cheers,
Cameron Simpson <cs at cskk.id.au>


