[issue22167] iglob() has misleading documentation (does indeed store names internally)

Serhiy Storchaka report at bugs.python.org
Sun May 31 13:50:02 EDT 2020


Serhiy Storchaka <storchaka+cpython at gmail.com> added the comment:

Yes, for the pattern 'a*/b*/c*' you will have an open file descriptor for every component with metacharacters:

    for a in scandir('.'):
        if fnmatch(a.name, 'a*'):
            for b in scandir(a.path):
                if fnmatch(b.name, 'b*'):
                    for c in scandir(b.path):
                        if fnmatch(c.name, 'c*'):
                            yield c.path

You can have hundreds of nested directories. That may not look bad, because the default limit on the number of open file descriptors is 1024 on Linux. But imagine you run a server that handles tens of requests simultaneously. Some or all of them will fail, and not just by returning an error: they will return an incorrect result, because every OSError, including "Too many open files", is silenced in glob().
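
The silencing is easy to observe with a simpler failure than descriptor exhaustion. A minimal sketch, assuming a missing directory can stand in for any OSError raised while listing a directory (the temporary directory and the name "missing" are illustrative only):

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # "missing" does not exist, so listing it raises FileNotFoundError
    # (a subclass of OSError) inside glob.
    pattern = os.path.join(tmp, "missing", "*")
    result = glob.glob(pattern)

# glob swallows the error, so the caller cannot tell "no matches"
# apart from "the directory could not be read".
print(result)
```

A "Too many open files" error raised while listing a directory would presumably surface to the caller the same way: as missing results rather than as an exception.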

Also, none of these file descriptors will be closed until you finish the iteration or, in case of an error, until the garbage collector closes them (because interrupted generators tend to create reference loops).
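
For a caller that abandons the iteration early, one way to avoid waiting for the garbage collector is to close the generator explicitly. A rough sketch, in which iter_names() is a hypothetical stand-in for any generator that keeps a scandir() iterator open across a yield:

```python
import os
import tempfile
from contextlib import closing

def iter_names(path):
    # Hypothetical helper: the scandir iterator stays open for as long
    # as the generator is suspended at the yield below.
    with os.scandir(path) as it:
        for entry in it:
            yield entry.name

with tempfile.TemporaryDirectory() as tmp:
    for name in ("a", "b", "c"):
        open(os.path.join(tmp, name), "w").close()

    gen = iter_names(tmp)
    with closing(gen):
        first = next(gen)  # stop after a single entry
    # closing() called gen.close(), which unwound the with-block inside
    # the generator and closed the scandir iterator deterministically.
    closed = gen.gi_frame is None
```

This only helps callers that remember to do it, which is why the fix belongs inside glob itself.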

So it is vital to close the file descriptor before you open other file descriptors in the recursion.
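
A minimal sketch of that idea, assuming the pattern has already been split into components. The names list_matching() and iglob_sketch() are hypothetical and this is not the glob module's actual code. Each directory listing is read in full and its descriptor closed before the recursion descends, so at most one descriptor is open at a time:

```python
import os
from fnmatch import fnmatch

def list_matching(path, pattern, dirs_only):
    # Read the whole listing into a list; the with-block closes the
    # descriptor before this function returns.
    with os.scandir(path) as it:
        return [entry.path for entry in it
                if fnmatch(entry.name, pattern)
                and (not dirs_only or entry.is_dir())]

def iglob_sketch(root, patterns):
    # Hypothetical helper: patterns is a list such as ['a*', 'b*', 'c*'].
    if not patterns:
        yield root
        return
    head, rest = patterns[0], patterns[1:]
    # Intermediate components must be directories to recurse into.
    for path in list_matching(root, head, dirs_only=bool(rest)):
        yield from iglob_sketch(path, rest)
```

The trade-off is that the names of each directory are held in memory while its subtree is being walked, in exchange for a bounded number of open descriptors. Unlike glob(), this sketch does not silence OSError and does not skip hidden files.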

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue22167>
_______________________________________

