[Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info

Ben Hoyt benhoyt at gmail.com
Mon May 13 02:21:36 CEST 2013


> I would prefer to go the other route and don't expose lstat(). It's
> cleaner and less confusing to have a property cached_lstat on the object
> because it actually says what it contains. The property's internal code
> can do a lstat() call if necessary.

Are you suggesting just accessing .cached_lstat could call os.lstat()?
That seems very bad to me. It's a property access -- it looks cheap,
so people will expect it to be cheap. From PEP 8: "Avoid using
properties for computationally expensive operations; the attribute
notation makes the caller believe that access is (relatively) cheap."

Even worse is error handling -- I'd expect the expression
"entry.cached_lstat" to only ever raise AttributeError, not OSError in
the case it calls stat under the covers. Calling code would have to
have a try/except around what looked like a simple attribute access.

For these two reasons I think lstat() should definitely be a function.
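
To make that concrete, here is a rough sketch of the method-based shape I
have in mind. The DirEntry class name, its constructor and the _lstat
attribute are all placeholders for illustration, not a proposed API:

```python
import os


class DirEntry:
    """Hypothetical entry object; name and layout are assumptions."""

    def __init__(self, dirpath, name):
        self.name = name
        self._path = os.path.join(dirpath, name)
        self._lstat = None  # filled in by the first lstat() call

    def lstat(self):
        # An explicit call tells the reader this may hit the file system
        # and may raise OSError; later calls reuse the cached result.
        if self._lstat is None:
            self._lstat = os.lstat(self._path)
        return self._lstat
```

With this shape, a try/except OSError around entry.lstat() reads naturally,
which it would not around a bare attribute access.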

> Your code example doesn't handle the case of a failing lstat() call. It
> can happen when the file is removed or permission of a parent directory
> changes.

True. My isdir/isfile/islink implementations should catch any OSError
from the lstat() and return False (as os.path.isdir() and friends do).
That way, calling code still doesn't need try/excepts around the isdir()
calls. This is how os.walk() is implemented -- there's no extra error
handling around the isdir() call.
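
Roughly, the corrected helpers could look like the sketch below. The
DirEntry class and the _check() helper are made-up names for illustration
only. Note that because this version is based on lstat(), it does not
follow symlinks, unlike os.path.isdir():

```python
import os
import stat


class DirEntry:
    """Hypothetical entry object, for illustration only."""

    def __init__(self, dirpath, name):
        self.name = name
        self._path = os.path.join(dirpath, name)

    def lstat(self):
        return os.lstat(self._path)  # may raise OSError

    def _check(self, predicate):
        # Treat a failed lstat() (file removed, permission denied, ...)
        # as "not that kind of file" instead of raising.
        try:
            return predicate(self.lstat().st_mode)
        except OSError:
            return False

    def isdir(self):
        return self._check(stat.S_ISDIR)

    def isfile(self):
        return self._check(stat.S_ISREG)

    def islink(self):
        return self._check(stat.S_ISLNK)
```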

> Why not have both? The os module exposes and leaks the platform details
> on more than one occasion. A low level function can expose name + dirent
> struct on POSIX and name + stat_result on Windows. Then you can build a
> high level API like os.scandir() in pure Python code.

I wouldn't be opposed to that, but it's a scandir() implementation
detail. If there's a scandir_helper_win() and scandir_helper_posix()
written in C, and the rest is written in Python, that'd be fine by me.
As long as the Python part didn't slow it down much.

> The function should use fstatat(2) function (os.lstat with dir_fd) when
> it is available on the current platform. It's better and more secure
> than lstat() with a joined path.

Sure. I'm primarily a Windows dev, so I'm not too familiar with all the
fancy stat* functions. But what you're saying makes sense.
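
As I understand the suggestion, it would look something like the sketch
below (Python 3.3+). The lstat_entries() name is made up, and I'm assuming
os.stat() with dir_fd= and follow_symlinks=False is the way to spell an
fstatat()-style lstat, falling back to a joined path where that isn't
supported:

```python
import os


def lstat_entries(path):
    """Yield (name, stat_result) per entry in path, not following symlinks."""
    names = os.listdir(path)
    if (os.stat in os.supports_dir_fd
            and os.stat in os.supports_follow_symlinks):
        # stat each name relative to an open directory descriptor
        dir_fd = os.open(path, os.O_RDONLY)
        try:
            for name in names:
                # may raise OSError if the entry vanished in the meantime
                yield name, os.stat(name, dir_fd=dir_fd,
                                    follow_symlinks=False)
        finally:
            os.close(dir_fd)
    else:
        # fallback (e.g. Windows): lstat() on the joined path
        for name in names:
            yield name, os.lstat(os.path.join(path, name))
```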

-Ben

