[Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator

Glenn Linderman v+python at g.nevcal.com
Fri Jun 27 04:43:34 CEST 2014


I'm generally +1, with opinions noted below on these two topics.

On 6/26/2014 3:59 PM, Ben Hoyt wrote:
> Should there be a way to access the full path?
> ----------------------------------------------
>
> Should ``DirEntry``'s have a way to get the full path without using
> ``os.path.join(path, entry.name)``? This is a pretty common pattern,
> and it may be useful to add pathlib-like ``str(entry)`` functionality.
> This functionality has also been requested in `issue 13`_ on GitHub.
>
> .. _`issue 13`: https://github.com/benhoyt/scandir/issues/13

+1
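
For concreteness, here is a minimal sketch of the join pattern the question refers to. It uses plain ``os.listdir`` rather than the proposed ``scandir()``, and the helper name ``full_paths`` is made up for illustration; it is not part of the PEP.

```python
import os
import tempfile


def full_paths(path):
    """Yield the full path of every entry directly inside ``path``."""
    for name in os.listdir(path):
        # This join is the boilerplate that a full-path accessor on the
        # entry object would make unnecessary.
        yield os.path.join(path, name)


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "a.txt"), "w").close()
    result = sorted(full_paths(tmp))
    expected = [os.path.join(tmp, "a.txt")]
```

Every caller that wants to open or stat an entry ends up repeating that join, which is why a built-in way to get the full path seems worthwhile.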

> Should it expose Windows wildcard functionality?
> ------------------------------------------------
>
> Should ``scandir()`` have a way of exposing the wildcard functionality
> in the Windows ``FindFirstFile`` / ``FindNextFile`` functions? The
> scandir module on GitHub exposes this as a ``windows_wildcard``
> keyword argument, allowing Windows power users the option to pass a
> custom wildcard to ``FindFirstFile``, which may avoid the need to use
> ``fnmatch`` or similar on the resulting names. It is named the
> unwieldy ``windows_wildcard`` to remind you you're writing power-user,
> Windows-only code if you use it.
>
> This boils down to whether ``scandir`` should be about exposing all of
> the system's directory iteration features, or simply providing a fast,
> simple, cross-platform directory iteration API.
>
> This PEP's author votes for not including ``windows_wildcard`` in the
> standard library version, because even though it could be useful in
> rare cases (say the Windows Dropbox client?), it'd be too easy to use
> it just because you're a Windows developer, and create code that is
> not cross-platform.

Because another common pattern is to check whether names match a
pattern, I think it would be good to have a feature that provides that.
I do this in my own private directory listing extensions, and some of my
command-line tools also expose it to the user. Where it is exposed to
the user, I use -p windows-pattern and -P regexp. My implementation
converts the windows-pattern to a regexp and then uses common code. For
this particular API, though, because windows_wildcard can be optimized
by the Windows API call used, it would make more sense to pass
windows_wildcard directly to FindFirstFile on Windows, and to convert it
to a regexp on *nix. Both Windows and *nix would call re to process
pattern matches, except when a Windows pattern is passed in on Windows.
The alternate parameter could simply be called wildcard, and would be a
regexp. If desired, other flavors of wildcard (bsd_wildcard?) could also
be implemented, but I'm not sure there would be any benefit, since, as
far as I am aware, those systems have no optimizations for such
patterns.
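
A rough sketch of the portable half of this idea, under some loud assumptions: the function name ``filtered_names`` is hypothetical, it is built on ``os.listdir`` rather than on ``scandir()``, and ``fnmatch.translate`` stands in for the Windows-pattern-to-regexp conversion even though fnmatch rules only approximate what ``FindFirstFile`` matches. On Windows a real implementation would hand the pattern to ``FindFirstFile`` instead of taking this path.

```python
import fnmatch
import os
import re
import tempfile


def filtered_names(path, wildcard=None, windows_wildcard=None):
    """Hypothetical helper: return sorted names in ``path`` matching a pattern.

    ``wildcard`` is a regexp, matched from the start of each name.
    ``windows_wildcard`` is a glob-style pattern such as ``*.txt``.
    """
    if wildcard is not None and windows_wildcard is not None:
        raise ValueError("pass only one of wildcard and windows_wildcard")
    flags = 0
    if windows_wildcard is not None:
        # Portable fallback: turn the glob into a regexp so both kinds of
        # pattern share the same matching code. Assumption: fnmatch rules
        # are close enough to Windows wildcard rules for this sketch.
        wildcard = fnmatch.translate(windows_wildcard)
        flags = re.IGNORECASE  # Windows patterns ignore case
    names = os.listdir(path)
    if wildcard is None:
        return sorted(names)
    matcher = re.compile(wildcard, flags)
    return sorted(name for name in names if matcher.match(name))


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for filename in ("a.txt", "B.TXT", "c.py"):
        open(os.path.join(tmp, filename), "w").close()
    by_glob = filtered_names(tmp, windows_wildcard="*.txt")
    by_regexp = filtered_names(tmp, wildcard=r".*\.py$")
    everything = filtered_names(tmp)
```

The point of keeping two parameters is that only the Windows pattern has a native fast path; everything else goes through ``re`` on every platform.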