[Python-Dev] * and ? in fnmatch

Greg Ward gward@cnri.reston.va.us
Thu, 17 Feb 2000 09:48:43 -0500


Hi all --

I have recently been playing with the fnmatch module, and learned that *
and ? as translated by 'fnmatch.translate()' match *all* characters,
including slashes, colons, backslashes -- in short, whatever happens to
be a "special" character for pathnames on the current platform.

In other words, "foo?bar.py" matches both "foo_bar.py" and "foo/bar.py".
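
A quick way to check this is to compile the regular expression that
'fnmatch.translate()' returns and try both names against it.  This is
only a minimal sketch; the variable names are mine:

```python
import fnmatch
import re

# fnmatch.translate() turns a shell-style pattern into a regular expression.
regex = re.compile(fnmatch.translate("foo?bar.py"))

# '?' is translated to "any single character", so both names match.
matches_underscore = regex.match("foo_bar.py") is not None
matches_slash = regex.match("foo/bar.py") is not None

print(matches_underscore, matches_slash)
```

Both values come out true: the slash is treated like any other character.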

This is not the way any Unix shell that I know of works, nor is it how
the wildcard-expanding MS-DOS system calls that I dimly remember from a
decade or so back worked.  I dunno how wildcard expansion is done under
Windows nowadays, but I wouldn't expect * and ? to match colons or
backslashes there any more than I expect them to match slash under a
Unix shell.

So is this a bug or a feature?

Seems to me that a good fix would be to extend 'fnmatch.translate()' to
have some (maybe all?) of the flags that the standard Unix library
'fnmatch()' supports.  The flag in question here is FNM_PATHNAME, which
is described in the Solaris manual as

     FNM_PATHNAME                  If set, a slash (/)  character
                                   in  string  will be explicitly
                                   matched by a slash in pattern;
                                   it  will  not  be  matched  by
                                   either  the  asterisk  (*)  or
                                   question-mark   (?)    special
                                   characters, nor by  a  bracket
                                   ([]) expression.

                                   If not set, the slash  charac-
                                   ter  is treated as an ordinary
                                   character.

and in the GNU/Linux manual as

       FNM_PATHNAME
              If  this  flag is set, match a slash in string only
              with a slash in pattern and not, for example,  with
              a [] - sequence containing a slash.

To adapt this to Python's 'fnmatch.translate()', I think "slash" would
have to be generalized to "special character", which is platform
dependent:

  Unix             /
  DOS/Windows      : \ (and maybe / too?)
  Mac              :
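
As a rough illustration of how that table might be derived in code, here
is a sketch built on the 'sep' and 'altsep' attributes of the 'posixpath'
and 'ntpath' modules.  The helper name 'special_chars' is hypothetical,
adding the colon for DOS/Windows is my assumption based on the table
above, and the Mac row is not covered:

```python
import ntpath
import posixpath

def special_chars(pathmod):
    """Hypothetical helper: characters that '*' and '?' should not match."""
    chars = {pathmod.sep}
    if pathmod.altsep:
        # ntpath has an alternate separator ('/'); posixpath has none.
        chars.add(pathmod.altsep)
    if pathmod is ntpath:
        # Assumption: the drive-letter colon also counts as "special".
        chars.add(":")
    return chars

unix_special = special_chars(posixpath)
windows_special = special_chars(ntpath)
```

Passing the path module explicitly, rather than reading 'os.sep', keeps
the sketch usable for a platform other than the one it runs on.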

I propose changing the signature of 'fnmatch.translate()' from

   def translate(pat)

to at least

   def translate(pat, pathname=0)

and possibly to

   def translate(pat,
                 pathname=0, 
                 noescape=0,
                 period=0,
                 leading_dir=0,
                 casefold=0)

which follows the lead of GNU 'fnmatch()'.  (Solaris 'fnmatch()' only
supports the PATHNAME, NOESCAPE, and PERIOD flags; the GNU man page says
LEADING_DIR and CASEFOLD are GNU extensions.  I like GNU extensions.)
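
To make the 'pathname' flag concrete, here is a minimal sketch of what
such a translation could look like.  It is not the existing
'fnmatch.translate()': the function name 'translate_sketch' and the
'seps' parameter are hypothetical, bracket expressions are deliberately
left out (a '[' is treated as a literal character), and none of the other
flags is attempted:

```python
import re

def translate_sketch(pat, pathname=0, seps="/"):
    """Hypothetical variant of translate(); handles only '*' and '?'.

    With pathname true, '*' and '?' refuse to match any character
    listed in seps.  Everything else in pat is matched literally.
    """
    if pathname:
        one = "[^%s]" % re.escape(seps)   # any character except a separator
    else:
        one = "."                         # any character at all
    parts = []
    for c in pat:
        if c == "*":
            parts.append(one + "*")
        elif c == "?":
            parts.append(one)
        else:
            parts.append(re.escape(c))
    # (?s) lets '.' match newlines too; \Z anchors at the end of the string.
    return "(?s)" + "".join(parts) + r"\Z"

def matches(pat, name, pathname=0, seps="/"):
    return re.match(translate_sketch(pat, pathname, seps), name) is not None
```

With 'pathname' left at 0, "foo?bar.py" still matches "foo/bar.py"; with
'pathname=1' it matches only "foo_bar.py", and "*.py" no longer matches
"a/b.py" while "*/*.py" does.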

Similar optional parameters would be added to 'fnmatch()' and
'fnmatchcase()', possibly dropping the 'casefold' argument since it's
covered by which function you're calling.

I have yet to fully grok the meaning of those other four flags, though,
so I'm not sure how easy it would be to hack them into
'fnmatch.translate()'.

Opinions?

        Greg