Suggestions on mechanism or existing code - maintain persistence of file download history

Peter J. Holzer hjp-python at hjp.at
Sat Feb 1 06:58:15 EST 2020


On 2020-01-30 07:56:30 +1100, Chris Angelico wrote:
> On Thu, Jan 30, 2020 at 7:49 AM MRAB <python at mrabarnett.plus.com> wrote:
> > On 2020-01-29 20:00, jkn wrote:
> > > I could have a file with all the URLs listed and work through each line in turn.
> > > But then I would have to rewrite the file (say, with the previously-successful
> > > lines commented out) as I go.
> > >
> > Why comment out the lines yourself when the download manager could do it
> > for you?
> >
> > Load the list from disk.
> >
> > For each uncommented line:
> >
> >      Download the file.
> >
> >      Comment out the line.
> >
> >      Write the list back to disk.
> 
> Isn't that exactly what the OP was talking about? It involves
> rewriting the file at every step, with the consequent risks of
> trampling on other changes, corruption on error, etc, etc, etc.

If you do it right, the risk is small:

    1) read the file,
    2) do the download,
    3) write a temporary file with the changes,
    4) rename the temporary file to the original file name.
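
A minimal sketch of steps 1, 3 and 4 in Python, assuming the list is a
plain text file with one URL per line and that a leading "# " marks a
finished download. The function name mark_done and the ".urls-" prefix
are made up for illustration; the download itself (step 2) is assumed
to have succeeded before the call.

```python
import os
import tempfile


def mark_done(list_path, url):
    """Comment out `url` in the list file by rewriting it via a temporary file."""
    # 1) read the file
    with open(list_path, encoding="utf-8") as f:
        lines = f.read().splitlines()

    # 3) write a temporary file with the changes, in the same directory
    #    as the list so that the rename stays on one filesystem
    directory = os.path.dirname(os.path.abspath(list_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".urls-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            for line in lines:
                if line.strip() == url:
                    line = "# " + line
                tmp.write(line + "\n")
        # 4) rename the temporary file over the original
        os.replace(tmp_path, list_path)
    except BaseException:
        # On any failure the original list is left untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A reader of the list sees either the old or the new version, never a
half-written one. One side effect to be aware of: mkstemp creates the
file with mode 0600, so the rewritten list ends up with those
permissions.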

Remaining risks:

    Someone might start the download manager twice. This can be
    prevented with a lock file.

    The computer might crash between steps 3 and 4 (or shortly after
    step 4). In this case you might lose the contents of the file. This
    can be prevented with proper use of fsync, but at this point it gets
    complicated and filesystem-dependent (there have been a number of
    papers and talks about this topic over the last few years), so
    personally I would live with the risk or use a database.

        Subrisk: Your disk might lie to your computer about having
        stored the data. In this case there is nothing you can do except
        buy a better disk.
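
A rough sketch of how the two remaining risks could be approached in
Python. The names lock_file and durable_replace and the ".tmp" suffix
are invented for illustration, and the fsync part is only the commonly
described pattern (sync the file, rename, sync the directory); it does
not settle the filesystem-dependent details mentioned above.

```python
import os
from contextlib import contextmanager


@contextmanager
def lock_file(path):
    """Hold a lock file; raises FileExistsError if another run holds it."""
    # O_EXCL makes the creation fail if the file already exists.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(f"{os.getpid()}\n")  # record the owner for debugging
        yield
    finally:
        os.unlink(path)


def durable_replace(path, text):
    """Replace `path` with `text`, asking the OS to flush data to disk."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = path + ".tmp"  # hypothetical naming scheme
    with open(tmp_path, "w", encoding="utf-8") as tmp:
        tmp.write(text)
        tmp.flush()             # push Python's buffer to the OS
        os.fsync(tmp.fileno())  # ask the OS to push the data to the disk
    os.replace(tmp_path, path)
    if os.name == "posix":
        # Sync the directory too, so that the rename itself is recorded.
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        except OSError:
            pass  # some filesystems do not support fsync on a directory
        finally:
            os.close(dir_fd)
```

The rewrite would then run inside "with lock_file(...):", with
durable_replace called in the body. If the program crashes while
holding the lock, the lock file stays behind and has to be removed by
hand.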

        hp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp at hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"