[ python-Bugs-1234674 ] filecmp.cmp's "shallow" option

SourceForge.net noreply at sourceforge.net
Fri Aug 26 10:19:58 CEST 2005


Bugs item #1234674, was opened at 2005-07-08 11:01
Message generated for change (Comment added) made by birkenfeld
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1234674&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Extension Modules
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Mendez (goatsofmendez)
>Assigned to: Raymond Hettinger (rhettinger)
Summary: filecmp.cmp's "shallow" option

Initial Comment:
The filecmp.cmp function has a shallow option (enabled by
default) that compares files by their os.stat() signatures
rather than doing a byte-by-byte comparison of the contents.
The relevant part of the code follows.

    s1 = _sig(os.stat(f1))
    s2 = _sig(os.stat(f2))
    if s1[0] != stat.S_IFREG or s2[0] != stat.S_IFREG:
        return False
    if shallow and s1 == s2:
        return True
    if s1[1] != s2[1]:
        return False

    result = _cache.get((f1, f2))
    if result and (s1, s2) == result[:2]:
        return result[2]
    outcome = _do_cmp(f1, f2)
    _cache[f1, f2] = s1, s2, outcome
    return outcome

There's a check for shallow mode: if it is enabled and the
stat signatures match, the function returns True. The test
for returning False, however, looks at only one of the stat
attributes (s1[1]). So a pair of files can fail both tests,
and the function then carries on to the full file check
lower down, even in shallow mode.
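
The fall-through described above can be observed with a small
experiment. This is only a sketch: it assumes the installed
filecmp still follows the logic of the excerpt, and the helper
names (write_file, shallow_fallthrough_demo) and file names are
made up for illustration.

```python
import filecmp
import os
import tempfile


def write_file(path, data, mtime):
    """Write data to path, then force its access and modification times."""
    with open(path, "wb") as f:
        f.write(data)
    os.utime(path, (mtime, mtime))


def shallow_fallthrough_demo():
    """Compare equal-sized files whose mtimes differ, in shallow mode."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.bin")
        b = os.path.join(d, "b.bin")
        c = os.path.join(d, "c.bin")
        write_file(a, b"same bytes", 1_000_000_000)
        # Same size and same contents as a, but a later mtime.
        write_file(b, b"same bytes", 1_000_000_100)
        # Same size as a, different contents, later mtime.
        write_file(c, b"diff bytes", 1_000_000_100)
        # In both calls the stat signatures differ (mtime) while the
        # sizes match, so neither early return in the excerpt applies.
        return (
            filecmp.cmp(a, b, shallow=True),
            filecmp.cmp(a, c, shallow=True),
        )


if __name__ == "__main__":
    print(shallow_fallthrough_demo())
```

If the two results differ, the outcome in shallow mode was
decided by the file contents and not by the stat signatures
alone, since both pairs have the same sizes and the same
mtime difference.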

The check above to see if the stats match with
stat.S_IFREG also looks wrong to me, but it could just
be I don't understand what he's trying to do :)
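
For context on the S_IFREG check, here is a minimal sketch of
what it appears to test, assuming the first element of the
signature is the file-type part of st_mode as extracted by
stat.S_IFMT. The function names file_type and is_regular are
hypothetical and not part of filecmp.

```python
import os
import stat


def file_type(path):
    """Return only the file-type bits of the mode (permission bits dropped)."""
    return stat.S_IFMT(os.stat(path).st_mode)


def is_regular(path):
    """True if path is a regular file, False for directories and the like."""
    return file_type(path) == stat.S_IFREG
```

Under that assumption, the check in the excerpt makes cmp
return False whenever either argument is not a regular file
(a directory, for example), before any sizes or contents are
compared.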

This code is the same in both 2.3 and 2.4.

----------------------------------------------------------------------

>Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-08-26 10:19

Message:
Hm. Looks like if the size is identical, but the mtime is
not, the file will be read even in shallow mode.

The filecmp docs say "Unless shallow is given and is false,
files with identical os.stat() signatures are taken to be
equal."
The filecmp.cmp docstring says "shallow: Just check stat
signature (do not read the files)"

Two questions arise:
- Should the file contents be compared in shallow mode if
the mtimes do not match?
- Should the mtimes matter in non-shallow mode?
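
To make the first question concrete, here is a sketch of what a
strictly stat-only comparison could look like, one that never
reads the files. It is not a proposed patch: stat_signature and
stat_only_cmp are hypothetical names, and the signature layout
(type, size, mtime) is an assumption modelled on the excerpt.

```python
import os
import stat


def stat_signature(path):
    """Return (file type, size, mtime) for path."""
    st = os.stat(path)
    return (stat.S_IFMT(st.st_mode), st.st_size, st.st_mtime)


def stat_only_cmp(f1, f2):
    """Compare two files by stat signature alone; contents are never read."""
    s1 = stat_signature(f1)
    s2 = stat_signature(f2)
    # Anything that is not a regular file never compares equal.
    if s1[0] != stat.S_IFREG or s2[0] != stat.S_IFREG:
        return False
    # Equal only if type, size and mtime all match.
    return s1 == s2
```

With this variant, two files of equal size and content but
different mtimes compare as unequal, which matches the
docstring's "do not read the files" at the cost of reporting
touched copies as different.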

----------------------------------------------------------------------
