[Python-Dev] * and ? in fnmatch
Greg Ward
gward@cnri.reston.va.us
Thu, 17 Feb 2000 09:48:43 -0500
Hi all --
I have recently been playing with the fnmatch module, and learned that *
and ? as considered by 'fnmatch.translate()' match *all* characters,
including slashes, colons, backslashes -- in short, whatever happens to
be "special" characters for pathnames on the current platform.
In other words, "foo?bar.py" matches both "foo_bar.py" and "foo/bar.py".
This is not the way any Unix shells that I know of work, nor is it how
the wildcard-expanding MS-DOS system calls that I dimly remember from a
decade or so back worked. I dunno how wildcard expansion is done under
Windows nowadays, but I wouldn't expect * and ? to match colons or
backslashes there any more than I expect them to match slash under a
Unix shell.
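A quick way to see the behaviour described above is to compile the regex that 'fnmatch.translate()' returns and try it against a name containing a slash. This is only a small sketch using the standard 'fnmatch' and 're' modules; the variable names are made up for illustration.

```python
import fnmatch
import re

# Compile the regex that fnmatch.translate() produces for the pattern.
pattern = "foo?bar.py"
regex = re.compile(fnmatch.translate(pattern))

# Both names match: '?' is translated to a regex that accepts any
# character, including the pathname separator.
matches_underscore = regex.match("foo_bar.py") is not None
matches_slash = regex.match("foo/bar.py") is not None

# The same holds for '*', which happily crosses directory boundaries.
star_crosses_dirs = fnmatch.fnmatchcase("a/b/c.py", "*.py")

for label, result in (("foo_bar.py", matches_underscore),
                      ("foo/bar.py", matches_slash),
                      ("a/b/c.py vs *.py", star_crosses_dirs)):
    print(label, result)
```

All three results come out true, which is exactly the "matches slashes too" behaviour in question.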
So is this a bug or a feature?
Seems to me that a good fix would be to extend 'fnmatch.translate()' to
have some (maybe all?) of the flags that the standard Unix library
'fnmatch()' supports. The flag in question here is FNM_PATHNAME, which
is described in the Solaris manual as
    FNM_PATHNAME    If set, a slash (/) character in string will be
                    explicitly matched by a slash in pattern; it will
                    not be matched by either the asterisk (*) or
                    question-mark (?) special characters, nor by a
                    bracket ([]) expression.
                    If not set, the slash character is treated as an
                    ordinary character.
and in the GNU/Linux manual as
    FNM_PATHNAME
        If this flag is set, match a slash in string only
        with a slash in pattern and not, for example, with
        a [] - sequence containing a slash.
To adapt this to Python's 'fnmatch.translate()', I think "slash" would
have to be generalized to "special character", which is platform
dependent:
    Unix           /
    DOS/Windows    : \   (and maybe / too?)
    Mac            :
I propose changing the signature of 'fnmatch.translate()' from
    def translate(pat)
to at least
    def translate(pat, pathname=0)
and possibly to
    def translate(pat,
                  pathname=0,
                  noescape=0,
                  period=0,
                  leading_dir=0,
                  casefold=0)
which follows the lead of GNU 'fnmatch()'. (Solaris 'fnmatch()' only
supports the PATHNAME, NOESCAPE, and PERIOD flags; the GNU man page says
LEADING_DIR and CASEFOLD are GNU extensions. I like GNU extensions.)
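To make the 'pathname' idea concrete, here is a rough sketch of what such a translation could look like. It is not the real 'fnmatch.translate()': the function name 'translate_pathname' and the 'specials' parameter are hypothetical, and the sketch handles only * and ?, treating every other character (including brackets) as a literal.

```python
import re

def translate_pathname(pat, pathname=0, specials="/"):
    """Hypothetical variant of fnmatch.translate() with a 'pathname' flag.

    Only '*' and '?' are treated as wildcards; bracket expressions are
    deliberately left out of this sketch and are matched literally.
    'specials' is an assumed parameter naming the platform's special
    pathname characters (for example "/" on Unix, ":" on Mac).
    """
    if pathname:
        # Any single character except the special pathname characters.
        any_char = "[^" + re.escape(specials) + "]"
    else:
        # Current behaviour: any character at all.
        any_char = "."
    parts = []
    for ch in pat:
        if ch == "*":
            parts.append(any_char + "*")
        elif ch == "?":
            parts.append(any_char)
        else:
            parts.append(re.escape(ch))
    # (?s:...) lets '.' match newlines; \Z anchors at the end of the string.
    return "(?s:" + "".join(parts) + r")\Z"

loose = re.compile(translate_pathname("foo?bar.py"))
strict = re.compile(translate_pathname("foo?bar.py", pathname=1))
print(bool(loose.match("foo/bar.py")), bool(strict.match("foo/bar.py")))
```

With 'pathname' left at 0 the slash still matches, as it does today; with 'pathname=1' the pattern "foo?bar.py" no longer matches "foo/bar.py", while "*/*.py" still matches "a/b.py" because the slash appears explicitly in the pattern.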
Similar optional parameters would be added to 'fnmatch()' and
'fnmatchcase()', possibly dropping the 'casefold' argument since it's
covered by which function you're calling.
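For reference, the existing split between the two functions already covers case handling, which is why a separate 'casefold' argument looks redundant there. The sketch below uses only the standard 'fnmatch' and 'os.path' modules; the variable names are illustrative.

```python
import fnmatch
import os.path

name, pattern = "README.TXT", "*.txt"

# fnmatchcase() always compares case-sensitively.
case_sensitive = fnmatch.fnmatchcase(name, pattern)

# fnmatch() first normalizes both arguments with os.path.normcase(),
# so its result depends on the platform's filename conventions.
platform_default = fnmatch.fnmatch(name, pattern)
same_as_normcase = fnmatch.fnmatchcase(os.path.normcase(name),
                                       os.path.normcase(pattern))

# Folding case by hand gives a platform-independent, case-insensitive match.
explicit_fold = fnmatch.fnmatchcase(name.lower(), pattern.lower())

print(case_sensitive, platform_default, explicit_fold)
```

On a case-sensitive platform the first two results are false and the last is true; on a platform where 'os.path.normcase()' folds case, 'fnmatch()' matches as well.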
I have yet to fully grok the meaning of those other four flags, though,
so I'm not sure how easy it would be to hack them into
'fnmatch.translate()'.
Opinions?
Greg