Monitor a FTP site for arrival of new/updated files

Steve Holden steve at holdenweb.com
Sun Jan 25 15:11:36 EST 2009


python at bdurham.com wrote:
>  Any suggestions on a best practice way to monitor a remote FTP site for
> the arrival of new/updated files? I don't need specific code, just some
> coaching on technique based on your real-world experience including
> suggestions for a utility vs. code based solution.
> 
> My goal is to maintain a local collection of files synced with a remote
> FTP site and when I download a new/updated file locally, run a script to
> process it. The arrival and format of the files that I need to sync with
> are beyond my control (eliminating an rsync solution) ... all I have is a
> generic FTP connection to a specific FTP address. Note: The remote site
> I'm monitoring may have multiple uploads occurring at the same time.
> 
> My basic strategy is to poll the remote directory on a regular basis and
> compare the new directory listing to a previous snapshot of the
> directory listing. If a file timestamp or size has changed (or a new
> file has appeared), then track this file as a changed file. Once a file
> has been marked as changed, wait <N> polling cycles for the file
> timestamp and size to remain stable, then download it, and trigger a
> local script to process the file. In addition to detecting new or
> changed files, I would compare remote directory listings to my local
> sync folder and delete local files that no longer exist on the remote site.
> 
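The polling strategy described above could be sketched roughly as
follows. This is only an illustration under stated assumptions, not how
any existing utility works: the names StabilityTracker, poll and
names_to_delete are hypothetical, and each snapshot is assumed to be a
dict mapping a file name to a (size, timestamp) pair taken from the
remote listing.

```python
from typing import Dict, Iterable, List, Tuple

# (size, timestamp) as reported by the server; compared only for equality.
Signature = Tuple[int, str]


class StabilityTracker:
    """Report files whose size and timestamp stayed unchanged for N polls."""

    def __init__(self, stable_polls: int = 3) -> None:
        self.stable_polls = stable_polls
        # name -> (last signature seen, consecutive polls it stayed the same)
        self._pending: Dict[str, Tuple[Signature, int]] = {}
        # name -> signature of the version already reported as ready
        self._done: Dict[str, Signature] = {}

    def poll(self, snapshot: Dict[str, Signature]) -> List[str]:
        """Feed one directory snapshot; return names now ready to download."""
        ready = []
        for name, sig in snapshot.items():
            if self._done.get(name) == sig:
                continue  # this exact version was already reported
            previous = self._pending.get(name)
            if previous is not None and previous[0] == sig:
                count = previous[1] + 1  # unchanged since the last poll
            else:
                count = 0  # new or changed file: restart the wait
            if count >= self.stable_polls:
                ready.append(name)
                self._done[name] = sig
                self._pending.pop(name, None)
            else:
                self._pending[name] = (sig, count)
        # Forget files that have disappeared from the remote directory.
        for table in (self._pending, self._done):
            for name in [n for n in table if n not in snapshot]:
                del table[name]
        return sorted(ready)


def names_to_delete(local_names: Iterable[str],
                    snapshot: Dict[str, Signature]) -> List[str]:
    """Local files that no longer exist on the remote site."""
    return sorted(set(local_names) - set(snapshot))
```

Each name returned by poll() would then be downloaded and handed to the
processing script. Note that a stable size and timestamp is only a
heuristic: an upload that stalls for longer than N polling cycles would
still look finished.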
> My concern about using a utility is the utility's ability to detect when
> a remote file has finished being updated. I don't want to download files
> that are still in the process of being updated - I only want to download
> new/updated files after they've been closed on the remote site.
> 
> Any ideas appreciated!
> 
Well, the ftpmirror script (Tools/scripts/ftpmirror.py in the Python
source distribution) will cope with most of what you want to do as it
is, but I am unsure how you can determine whether a file is still in the
process of being written on the server.
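
For a code-based approach, one way to collect the size and timestamp of
each remote file is ftplib's mlsd() method. The sketch below assumes the
server supports the MLSD command (not all do) and a Python version whose
ftplib provides mlsd(); take_snapshot is a hypothetical helper name.
FTP itself offers no standard way to ask whether a file is still open
for writing, so comparing successive snapshots remains a heuristic.

```python
from typing import Dict, Tuple


def take_snapshot(ftp, path: str = "") -> Dict[str, Tuple[int, str]]:
    """Return {name: (size, modify)} for the regular files under path.

    ftp is expected to be a connected, logged-in ftplib.FTP instance
    (or any object with a compatible mlsd() method).
    """
    snapshot = {}
    for name, facts in ftp.mlsd(path, facts=["type", "size", "modify"]):
        if facts.get("type") != "file":
            continue  # skip directories and the "." / ".." entries
        # A server may omit facts; fall back to placeholder values.
        size = int(facts.get("size", -1))
        snapshot[name] = (size, facts.get("modify", ""))
    return snapshot


# Example use (host and directory are placeholders, not real values):
#   from ftplib import FTP
#   ftp = FTP("ftp.example.com")
#   ftp.login()
#   print(take_snapshot(ftp, "/incoming"))
```

Two snapshots taken a polling interval apart can be compared with plain
dict equality on a per-file basis to see what has changed.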

regards
 Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC              http://www.holdenweb.com/



