filecmp.dircmp performance

Peter Otten __peter__ at web.de
Sat Jan 8 11:28:56 EST 2011


dads wrote:

> I'm creating a one-way sync program to automate backing up data
> over the WAN from our shops to a server at head office. It uses
> filecmp.dircmp(), but the performance seems poor to me.
> 
>         for x in dc.diff_files:
>             srcfp = os.path.join(src, x)
>             self.fn777(srcfp)
>             if os.path.isfile(srcfp):
>                 try:
>                     shutil.copy2(srcfp, dst)
>                     self.lg.add_diffiles(src, x)
>                 except Exception as e:
>                     self.lg.add_errors(e)
> 
> I tested it at a store which is only around 50 miles away on a 10 Mbps
> line; the directory has 59 files that are under 100 KB. When it gets to
> dc.diff_files it takes 15 minutes to complete. Looking at filecmp.py,
> it's only using os.stat, so this seems excessively long.

As a baseline it would be interesting to see how long it takes to copy those 
59 files using system tools. 
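
A rough way to get a comparable number from Python itself is to time the
plain copy on its own, with no dircmp involved. This is only a sketch: it
uses shutil.copy2 (the call from the quoted code) rather than a system
tool, and time_copy is a hypothetical helper name, not part of any library.

```python
import os
import shutil
import time


def time_copy(src_dir, dst_dir):
    """Copy every regular file in src_dir to dst_dir.

    Returns (file_count, total_bytes, elapsed_seconds).
    Subdirectories are skipped; dst_dir must already exist.
    """
    start = time.perf_counter()
    count = 0
    total = 0
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        if os.path.isfile(path):
            total += os.path.getsize(path)
            shutil.copy2(path, dst_dir)  # copies data and metadata
            count += 1
    return count, total, time.perf_counter() - start
```

If copying all 59 files this way takes seconds rather than minutes, the
time is being spent in the comparison step and not in the transfer.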

However, there are efficient tools out there that work hard to reduce the
traffic over the net, which is likely to be the bottleneck. I suggest that
you have a look at

http://en.wikipedia.org/wiki/Rsync
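
If rsync is available on both ends, the sync program could simply drive it.
The following is a minimal sketch under that assumption; the helper names
are hypothetical, and the options used (--archive, --compress, --dry-run)
are standard rsync options. It defaults to a dry run so nothing is copied
until you ask for it.

```python
import subprocess


def build_rsync_command(src_dir, dest, dry_run=True):
    """Return an rsync argument list for a one-way sync of src_dir into dest."""
    cmd = ["rsync", "--archive", "--compress"]
    if dry_run:
        # Report what would be transferred without changing anything.
        cmd.append("--dry-run")
    # A trailing slash makes rsync copy the directory's contents,
    # not the directory itself.
    cmd.append(src_dir.rstrip("/") + "/")
    cmd.append(dest)
    return cmd


def run_sync(src_dir, dest, dry_run=True):
    """Run rsync and return its exit status (0 means success)."""
    return subprocess.call(build_rsync_command(src_dir, dest, dry_run))
```

The destination can be a local path or a remote one in rsync's usual
user@host:path form; the host and paths are whatever applies to your setup.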



More information about the Python-list mailing list