[issue26860] os.walk and os.fwalk yield namedtuple instead of tuple

Serhiy Storchaka report at bugs.python.org
Thu Apr 28 04:00:48 EDT 2016


Serhiy Storchaka added the comment:

Sorry, but I disagree with Raymond in many points.

> Classes are normally named with CamelCase.  Also, "walk_result" or "WalkResult" seems like an odd name that doesn't really fit.   DirEntry or DirInfo is a better match (see the OP's example, "for dir_entry in walk_it: ...")

See "stat_result", "statvfs_result", "waitid_result", "uname_result", and "times_result". DirEntry is already used in the os module. And if accept this feature, needed separate types for walk() and fwalk() results.

> The "versionchanged" should be a "versionadded".

os.walk() is not new. Just it's result is changed. Class "walk_result" can be tagged with "versionadded", but I'm not sure there is a need to document it separately. The documentation of the os module already too large. "uname_result" and "times_result" are not documented.

> The docs and code for fwalk() needs to be harmonized with walk() so the the tuple fields use the same names:  change (root, dirs, files) to (dirpath, dirnames, filenames).

(root, dirs, files) is shorter than (dirpath, dirnames, filenames) and these names were used with os.walk() and os.fwalk() for years.

I general, I have doubts about this feature.

1. There is little backward incompatibility. At least pickle is not backward compatible, and I guess other serialization methods.

2. os.walk() and os.fwalk() are purposed to be used in for loop with immediate unpacking result tuple:

    for root, dirs, files in os.walk(...):
        ...

Adding named tuple doesn't add any benefit for common case.

In OP case, you can either use fwalk-based implementation of walk (issue15200):

    def fwalk_as_walk(*args, **kwargs):
        for x in os.fwalk(*args, **kwargs):
            yield x[:-1]

or just ignore the rest of tuple items:

    for root, *_ in walk_it:
        ...

3. Using namedtuple is slower and consumes more memory than using tuple. Even for FS-related operation like os.walk() this can matter. A lot of code is optimized for exact tuples, with namedtuple this optimization is lost.

4. New names (dirpath, dirnames, filenames) are questionable. Why not use undersores (dir_names)? "dir" in dirpath refers to the current proceeded directory, but "dir" in dirnames refers to it's subdirectories. Currently you are free to use short names (root, dirs, files) from examples or what you prefer, but with namedtuple you are sticked with standard names forever. There are no names that satisfy everybody.

5. Third-party walk-like iterators generate tuples, so you can't use attribute access in too general code.

----------
nosy: +serhiy.storchaka

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue26860>
_______________________________________


More information about the Python-bugs-list mailing list