[New-bugs-announce] [issue18149] filecmp.cmp() - cache invalidation fails when file modification times haven't changed

Matej Fröbe report at bugs.python.org
Thu Jun 6 16:07:35 CEST 2013


New submission from Matej Fröbe:

Example:

  import filecmp

  with open('file1', 'w') as f:
    f.write('a')

  with open('file2', 'w') as f:
    f.write('a')

  print filecmp.cmp('file1', 'file2', shallow=False) # True

  with open('file2', 'w') as f:
    f.write('b')

  print filecmp.cmp('file1', 'file2', shallow=False) # True, but should be False

Because of the caching, both calls to filecmp.cmp() return True on my system, although the second one should return False.

When retrieving a value from the cache, filecmp.cmp() checks the signatures of the files:

  s1 = _sig(os.stat(f1))
  s2 = _sig(os.stat(f2))
  ...
  outcome = _cache.get((f1, f2, s1, s2))

But the signatures in the cache stay the same if the file sizes and modification times (os.stat().st_mtime) haven't changed since the last call, even if the content has changed.
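To illustrate the point, here is a minimal sketch that rebuilds the signature by hand and shows it surviving a rewrite. The helper names sig() and signature_survives_rewrite() are made up for this example, and the tuple layout (file type, size, mtime) is my reading of the private _sig() helper, so treat it as an assumption. The os.utime() calls pin the modification time to an arbitrary fixed value to emulate two writes landing within one tick of the file system's timestamp resolution.

```python
import os
import stat
import tempfile


def sig(path):
    """Rebuild the (file type, size, mtime) tuple that the cache appears to key on."""
    st = os.stat(path)
    return (stat.S_IFMT(st.st_mode), st.st_size, st.st_mtime)


def signature_survives_rewrite():
    """Return True if rewriting a file leaves its signature unchanged."""
    fixed_time = 1000000000  # arbitrary whole-second timestamp (assumption)
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, 'w') as f:
            f.write('a')
        os.utime(path, (fixed_time, fixed_time))
        before = sig(path)

        with open(path, 'w') as f:
            f.write('b')  # same size, different content
        # Pin the mtime again to emulate a rewrite within one timestamp tick.
        os.utime(path, (fixed_time, fixed_time))
        after = sig(path)

        return before == after
    finally:
        os.remove(path)
```

If signature_survives_rewrite() returns True, a cache keyed on such signatures has no way to notice that the content changed.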

The cache is mentioned in the documentation, but there is no documented way to clear it. It also isn't nice, IMO, that one has to worry about the file system's modification-time resolution when calling a simple file comparison function.
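As a possible workaround until this is sorted out, one can compare the contents directly and skip the cache altogether. The sketch below is not part of filecmp; cmp_uncached() and its bufsize parameter are hypothetical names chosen for this example.

```python
def cmp_uncached(path1, path2, bufsize=8 * 1024):
    """Compare two files byte by byte without consulting any cache."""
    with open(path1, 'rb') as f1, open(path2, 'rb') as f2:
        while True:
            b1 = f1.read(bufsize)
            b2 = f2.read(bufsize)
            if b1 != b2:
                return False  # contents (or lengths) differ
            if not b1:
                return True   # both files ended at the same point
```

This always reads the files, so it is slower for repeated comparisons, but the result does not depend on the modification times.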

----------
components: Library (Lib)
messages: 190715
nosy: fbm
priority: normal
severity: normal
status: open
title: filecmp.cmp() - cache invalidation fails when file modification times haven't changed
type: behavior
versions: Python 2.7

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue18149>
_______________________________________

