Suggestions on mechanism or existing code - maintain persistence of file download history

R.Wieser address at not.available
Thu Jan 30 03:35:07 EST 2020


jkn,

> MRAB's scheme does have the disadvantages to me that Chris has pointed 
> out.

Nothing that can't be countered by keeping copies of the last X number of 
to-be-downloaded-URLs files.

As for rewriting every time: you will /have/ to write something for every 
action (and flush the file!) if you want to be able to Ctrl-C (or worse) 
out of the program.
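
A minimal sketch of what that looks like, assuming the history is a plain 
text file (the file name is only an example):

    import os

    def record_url(url, path="downloaded-urls.txt"):
        # Append the URL and push it to disk immediately, so a Ctrl-C
        # right after the download still leaves it recorded.
        with open(path, "a") as f:
            f.write(url + "\n")
            f.flush()              # flush Python's own buffer
            os.fsync(f.fileno())   # ask the OS to write it to disk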

But you could opt to write this session's successfully downloaded URLs to a 
separate file, and only merge that with the original one at program start. 
That, together with an integrity check of the separate file (possibly on a 
line-by-line (URL) basis), should make corruption of the original file 
rather unlikely.
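
A rough sketch of that merge-at-start idea (the file names and the 
"looks like a URL" test are just placeholders for a real integrity check):

    import os

    MASTER = "downloaded-urls.txt"
    SESSION = "session-urls.txt"

    def merge_session_into_master():
        # At program start, fold the previous session's URLs into the
        # master file, skipping any line that fails a basic sanity check.
        if not os.path.exists(SESSION):
            return
        with open(SESSION) as src, open(MASTER, "a") as dst:
            for line in src:
                url = line.strip()
                if url.startswith(("http://", "https://")):  # crude check
                    dst.write(url + "\n")
            dst.flush()
            os.fsync(dst.fileno())
        os.remove(SESSION)  # the session file has been absorbed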


A database /sounds/ good, but what happens when you Ctrl-C out of a 
non-atomic operation?  How do you fix that?  IOW: databases can be 
corrupted for pretty much the same reasons as a simple data file (but with 
much worse consequences).

Also think of the old adage: "I had a problem, and then I thought I could 
use X.  Now I have two problems..." - with X traditionally being "regular 
expressions".  In other words: do KISS (keep it ....)


By the way: the "just write the URLs in a folder" method is not at all a bad 
one.  /Very/ easy to maintain, resilient (especially when you consider the 
self-repairing capabilities of some filesystems) and the polar opposite of 
"vendor lock-in". :-)

Regards,
Rudy Wieser



