[Python-Dev] Updates to PEP 471, the os.scandir() proposal

Ethan Furman ethan at stoneleaf.us
Tue Jul 8 22:22:33 CEST 2014


On 07/08/2014 12:34 PM, Ben Hoyt wrote:
>>
>> Better to just have the attributes be None if they were not fetched.  None
>> is better than hasattr anyway, at least in the respect of not having to
>> catch exceptions to function properly.
>
> The thing is, is_dir() and lstat() are not attributes (for a good
> reason). Please read the relevant "Rejected ideas" sections and let us
> know what you think. :-)

I did better than that -- I read the whole thing!  ;)

-1 on the PEP's implementation.

Just as an attribute does not imply a system call, a method named 'is_dir' /does/ imply one, and not making that
call can be just as misleading.

If we have this:

     size = 0
     for entry in scandir('/some/path'):
         size += entry.st_size

   - on Windows, this should Just Work (if I have the names correct ;)
   - on POSIX, etc., this should fail noisily with either an AttributeError
     ('entry' has no 'st_size') or a TypeError (cannot add None)

and the solution is equally simple:

     for entry in scandir('/some/path', stat=True):

   - if not Windows, perform a stat call at the same time
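
A rough sketch of what an eager stat=True could look like, built on top of the os.scandir() that eventually shipped.
The names scandir_eager, Entry and total_size are hypothetical and only for illustration; this is not the PEP's API.
It assumes Python 3.6+ (for the context-manager form of os.scandir()):

```python
import os
from collections import namedtuple

# Hypothetical record type, not part of PEP 471 or the os module.
Entry = namedtuple("Entry", "name path st")


def scandir_eager(path, stat=False):
    """Yield Entry records; st is an os.stat_result if stat=True, else None."""
    with os.scandir(path) as it:
        for de in it:
            # lstat-style (symlinks are not followed).  Whether this costs an
            # extra system call depends on the platform.
            st = de.stat(follow_symlinks=False) if stat else None
            yield Entry(de.name, de.path, st)


def total_size(path):
    """Sum st_size over the entries of one directory (no recursion)."""
    return sum(e.st.st_size for e in scandir_eager(path, stat=True))
```

Leaving off stat=True in this sketch makes e.st None, so e.st.st_size fails noisily with an AttributeError instead of
quietly doing a stat behind the caller's back.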

Now, of course, we might get errors.  I am not a big fan of wrapping everything in try/except, particularly when we 
already have a model to follow -- os.walk:

     for entry in scandir('/some/path', stat=True, onerror=record_and_skip):

If we don't care if an error crashes the script, leave off onerror.
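
For illustration only, here is one way the onerror idea could be sketched on top of the real os.scandir(). The names
scandir_onerror and record_and_skip are hypothetical. One deliberate difference from os.walk: os.walk ignores errors
when onerror is left off, while this sketch lets them propagate, as described above. Errors raised while advancing
the iterator itself are not handled here:

```python
import os


def scandir_onerror(path, stat=False, onerror=None):
    """Yield (name, stat_result_or_None); hand OSErrors to onerror if given."""
    try:
        it = os.scandir(path)
    except OSError as exc:
        if onerror is None:
            raise            # no handler: let the error crash the caller
        onerror(exc)
        return
    with it:
        for de in it:
            st = None
            if stat:
                try:
                    st = de.stat(follow_symlinks=False)
                except OSError as exc:
                    if onerror is None:
                        raise
                    onerror(exc)
                    continue  # skip the entry we could not stat
            yield de.name, st


errors = []


def record_and_skip(exc):
    """Example handler: remember the error and carry on."""
    errors.append(exc)
```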

If we don't need st_size and friends, leave off stat=True.

If we get better performance on Windows than on Linux, that's okay.

scandir is going into os because it may not behave the same on every platform.  Heck, even some non-os modules 
(multiprocessing comes to mind) do not behave the same on every platform.

I think caching the attributes for DirEntry is fine, but let's do it as a snapshot of one moment in time: not the
name now and the attributes 30 minutes later, when we finally get to you because we had a lot of processing/files
ahead of you (you being a DirEntry ;).
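
To make the snapshot point concrete, a small sketch (the names Snapshot and snapshot_dir are hypothetical, not anything
from the PEP) that stats every entry during the scan itself, so the name and its attributes belong to the same moment:

```python
import os
import time
from collections import namedtuple

# Hypothetical record: the name, its lstat result, and when both were taken.
Snapshot = namedtuple("Snapshot", "name st taken_at")


def snapshot_dir(path):
    """Return a list of Snapshot records, each stat'ed during the scan."""
    snaps = []
    with os.scandir(path) as it:
        for de in it:
            # Stat right away instead of deferring until the caller asks.
            snaps.append(Snapshot(de.name, os.lstat(de.path), time.time()))
    return snaps
```

If a file grows after the scan, its Snapshot still reports the size it had when the directory was read; anyone who
wants fresh data has to ask for it explicitly with another os.lstat() call.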

--
~Ethan~

