Case insensitive exists()?

Steven D'Aprano steve+comp.lang.python at pearwood.info
Thu Jan 23 07:06:46 EST 2014


On Wed, 22 Jan 2014 17:58:19 -0700, Larry Martell wrote:

> I have the need to check for a files existence against a string, but I
> need to do case-insensitively. 

Reading on, I see that your database assumes case-insensitive file names, 
while your file system is case-sensitive.

Suggestions:

(1) Move the files onto a case-insensitive file system. Samba, I believe, 
can duplicate the case-insensitive behaviour of NTFS even on ext3 or ext4 
file systems. (To be pedantic, NTFS can also optionally be
case-sensitive, although that is rarely used.) So if you stick the files on a 
samba file share set to case-insensitivity, samba will behave the way you 
want. (Although os.path.exists won't; you'll have to use ntpath.exists 
instead.)

(2) Normalize the database and the files. Do a one-off run through the 
files on disk, lowercasing the file names, followed by a one-off run 
through the database, doing the same. (Watch out for ambiguous names like 
"Foo" and "FOO".) Then you just need to ensure new files are always named 
in lowercase.
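
A rough sketch of the on-disk half of suggestion (2), assuming a single flat 
directory and a case-sensitive file system. The function names are made up 
for illustration, and the database side is not shown. It refuses to rename 
anything if two names would collide once lowercased:

```python
import os
from collections import defaultdict


def find_case_collisions(directory):
    """Return groups of names in `directory` that differ only by case."""
    groups = defaultdict(list)
    for name in os.listdir(directory):
        groups[name.lower()].append(name)
    # Keep only the lowercased keys that more than one real name maps to.
    return {low: sorted(names) for low, names in groups.items() if len(names) > 1}


def lowercase_names(directory):
    """Rename every entry in `directory` to lowercase (hypothetical helper).

    Subdirectories are renamed too, but not descended into.
    Raises ValueError, without renaming anything, if names like
    "Foo" and "FOO" would end up as the same file.
    """
    collisions = find_case_collisions(directory)
    if collisions:
        raise ValueError("ambiguous names, resolve by hand: %r" % collisions)
    renamed = []
    for name in os.listdir(directory):
        lower = name.lower()
        if lower != name:
            os.rename(os.path.join(directory, name),
                      os.path.join(directory, lower))
            renamed.append((name, lower))
    return renamed
```

Run it on a copy of the data first. The list of (old, new) pairs it returns 
can be kept as a record of what was changed.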


Also, keep in mind that just because os.path.exists reports a file exists 
*right now*, doesn't mean it will still exist a millisecond later when 
you go to use it. Consider avoiding os.path.exists altogether, and just 
trying to open the file. (Although I see you still have the problem that 
you don't know *which* directory the file will be found in.)
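
A minimal sketch of the "just try to open it" approach, extended to a list 
of candidate directories. The function name and the idea of passing the 
directories in as a list are assumptions for illustration, and 
FileNotFoundError needs Python 3.3 or later (on older versions you would 
catch IOError and check errno instead):

```python
import os


def open_first_found(filename, directories):
    """Open `filename` from the first directory that has it, else return None.

    There is no separate exists() check, so there is no window between
    checking for the file and opening it.
    """
    for directory in directories:
        try:
            return open(os.path.join(directory, filename), "rb")
        except FileNotFoundError:
            # Not in this directory (or the directory itself is missing).
            continue
    return None
```

The caller is responsible for closing the returned file object. Note that 
this is still case-sensitive on a case-sensitive file system.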

> I cannot efficiently get the name of
> every file in the dir and compare each with my string using lower(), as
> I have 100's of strings to check for, each in a different dir, and each
> dir can have 100's of files in it. Does anyone know of an efficient way
> to do this? There's no switch for os.path that makes exists() check
> case-insensitively is there?

Try ntpath.exists, although I'm not certain it will do what you want 
since it probably assumes the file system is case-insensitive.

It really sounds like you have a hard problem to solve here. I strongly 
recommend that you change the problem, by renaming the files, or at least 
moving them into a consistent location, rather than have to repeatedly 
search multiple directories. Good luck!

-- 
Steven


