[issue32612] pathlib.(Pure)WindowsPaths can compare equal but refer to different files

Steve Dower report at bugs.python.org
Tue Jan 23 03:20:06 EST 2018


Steve Dower <steve.dower at python.org> added the comment:

(FWIW, I don't think your "security" argument is going to be very convincing, as this problem has been around for far too long to be treated as suddenly urgent. But up to you.)

My fear is that if PureWindowsPath stops handling the >90% of cases it currently does (by making "Path"!="path"), then we'll see developers implement their own workarounds, most likely by replacing:

    my_set.add(p)

with:

    my_dict[str(p).lower()] = p

This doesn't improve anything, and actually regresses it as I pointed out, so removing case folding cannot be the answer.
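
To make that concrete, here is a minimal sketch (the path spellings and variable names are made up for illustration) showing that the workaround groups paths exactly the way the current case-folding comparison already does:

```python
from pathlib import PureWindowsPath

# Hypothetical spellings that a Windows user would treat as the same path.
paths = [PureWindowsPath(r"C:\Temp\Path"), PureWindowsPath(r"c:\temp\path")]

# Current behaviour: __eq__ and __hash__ fold case, so the set collapses them.
my_set = set()
for p in paths:
    my_set.add(p)

# The likely workaround if that folding were removed from pathlib.
my_dict = {}
for p in paths:
    my_dict[str(p).lower()] = p

print(len(my_set), len(my_dict))
```

Both containers end up with a single entry, so the hand-rolled key buys nothing over what PureWindowsPath does today.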

That leaves finding a better way to obtain hashes and comparisons for PureWindowsPath instances (note that all the functionality that appears to be at issue is inherited by WindowsPath, so there's presumably nothing to fix there). My initial guess would be that some combination of locale and flags to LCMapStringEx will give us a better sort key than .lower(), though I don't know what that combination is. The $UpCase file is not accessible directly, so that isn't any help.

I have heard that .upper() is considered to be "more accurate" for NTFS than .lower(), but have no confirmation. Calling stat is not an option for PureWindowsPath, so I'm running out of ideas on how to improve it.
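
As a rough illustration of how the two str methods differ as comparison keys, the sketch below uses the two sharp-s code points. That pair is my own choice of example, and whether a particular file system treats the two names as distinct files is an assumption here, not something this snippet checks:

```python
from pathlib import PureWindowsPath

small_sharp_s = "\u00df"    # LATIN SMALL LETTER SHARP S
capital_sharp_s = "\u1e9e"  # LATIN CAPITAL LETTER SHARP S

# .lower() maps both characters to the same string.
lower_conflates = small_sharp_s.lower() == capital_sharp_s.lower()

# .upper() keeps them apart, but expands the small form to two letters.
upper_conflates = small_sharp_s.upper() == capital_sharp_s.upper()
upper_expands = small_sharp_s.upper() == "SS"

# PureWindowsPath currently follows the .lower() result.
paths_equal = PureWindowsPath(small_sharp_s) == PureWindowsPath(capital_sharp_s)

print(lower_conflates, upper_conflates, upper_expands, paths_equal)
```

So .upper() separates this particular pair, but it would in turn make a name containing the small sharp s compare equal to one spelled with "SS", which suggests it is not a drop-in answer either.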

Perhaps the best answer is to add a note to the docs warning about this and suggesting not using pathlib if you are concerned? And document the stat() method, or maybe offer some way to integrate with filecmp?
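
For the docs, something along these lines might do. It is only a sketch: the file and directory names are invented, and it uses the existing Path.samefile() and os.path.samestat() calls to ask the file system rather than compare strings:

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "sub").mkdir()

    target = base / "example.txt"
    target.write_text("data")
    other = base / "other.txt"
    other.write_text("data")

    # A second spelling of the same file; pathlib does not collapse "..".
    alias = base / "sub" / ".." / "example.txt"

    # Pure comparison only looks at the path text.
    equal_as_paths = alias == target

    # These ask the OS, so they require the files to exist.
    same_by_samefile = alias.samefile(target)
    same_by_stat = os.path.samestat(alias.stat(), target.stat())
    different_file = target.samefile(other)

print(equal_as_paths, same_by_samefile, same_by_stat, different_file)
```

This only works for concrete paths to files that exist, which is exactly why it cannot help PureWindowsPath.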

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue32612>
_______________________________________

