[Python-Dev] Updates to PEP 471, the os.scandir() proposal
Ben Hoyt
benhoyt at gmail.com
Wed Jul 9 15:22:41 CEST 2014
> Option 2:
> def log_err(exc):
>     logger.warn("Cannot stat {}".format(exc.filename))
>
> def get_tree_size(path):
>     total = 0
>     for entry in os.scandir(path, info='lstat', onerror=log_err):
>         if entry.is_dir:
>             total += get_tree_size(entry.full_name)
>         else:
>             total += entry.lstat.st_size
>     return total
>
> On this basis, #2 wins.
That's a pretty nice comparison, and you're right, onerror handling is
nicer here.
> However, I'm slightly uncomfortable using the
> filename attribute of the exception in the logging, as there is
> nothing in the docs saying that this will give a full pathname. I'd
> hate to see "Unable to stat __init__.py"!!!
Huh, you're right. I think this should be documented in os.walk() too.
I think it should be the full filename (is it currently?).
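For anyone who wants to check what os.walk() does today, here is a minimal sketch. The helper name collect_walk_errors is made up for illustration; it only records the .filename of each exception passed to onerror, and the only error it provokes is a missing top-level directory.

```python
import os
import tempfile

def collect_walk_errors(top):
    """Walk `top` and return the .filename of every error passed to onerror."""
    seen = []
    for _ in os.walk(top, onerror=lambda exc: seen.append(exc.filename)):
        pass
    return seen

with tempfile.TemporaryDirectory() as root:
    # A path that does not exist, so listing it fails immediately.
    missing = os.path.join(root, "no_such_dir")
    print(collect_walk_errors(missing))
```

This only covers the case where listing the top directory fails; errors on nested directories would need a separate check.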
> So maybe the onerror function should also receive the DirEntry object
> - which will only have the name and full_name attributes, but that's
> all that is needed.
That's an interesting idea -- though enough of a deviation from
os.walk()'s onerror that I'm uncomfortable with it -- I'd rather just
document that the onerror exception .filename is the full path name.
One issue with option #2 that I just realized -- does scandir yield
the entry at all if there's a stat error? It can't really, because the
caller will expect the .lstat attribute to be set (assuming he asked
for info='lstat') but it won't be. Is effectively removing these
entries just because the stat failed a problem? I kind of think it is.
If so, is there a way to solve it with option #2?
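To make the concern concrete, here is a rough pure-Python sketch of how an eager-lstat iterator might behave. Nothing here is the proposed implementation: scandir_sketch, Entry and the injectable lstat parameter are all hypothetical stand-ins, and the stat failure is simulated so that the example is deterministic.

```python
import os
import tempfile
from collections import namedtuple

# Hypothetical stand-in for the proposed DirEntry; not the real class.
Entry = namedtuple("Entry", "name full_name lstat")

def scandir_sketch(path, onerror=None, lstat=os.lstat):
    """Yield an Entry per name in `path`, calling lstat eagerly."""
    for name in os.listdir(path):
        full_name = os.path.join(path, name)
        try:
            st = lstat(full_name)
        except OSError as exc:
            if onerror is not None:
                onerror(exc)
            continue  # no value for .lstat, so the entry is silently dropped
        yield Entry(name, full_name, st)

def flaky_lstat(p):
    """Simulate a stat failure for one particular file name."""
    if os.path.basename(p) == "vanished":
        raise FileNotFoundError(2, "No such file or directory", p)
    return os.lstat(p)

with tempfile.TemporaryDirectory() as root:
    for name in ("kept", "vanished"):
        open(os.path.join(root, name), "w").close()
    errors = []
    names = sorted(e.name for e in scandir_sketch(root, errors.append, flaky_lstat))
    print(names, [os.path.basename(exc.filename) for exc in errors])
```

In this sketch the caller iterating over the results never sees "vanished" at all; only the onerror callback learns that it existed.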
> OK, looks like option #2 is now my preferred option. My gut instinct
> still rebels over an API that deliberately throws information away in
> the default case, even though there is now an option to ask it to keep
> that information, but I see the logic and can learn to live with it.
In terms of throwing away info "in the default case" -- it's simply a
case of getting what you ask for. :-) Worst case, you'll write your
code and test it, it'll fail hard on any system, you'll fix it
immediately, and then it'll work on any system.
-Ben