[issue39693] tarfile's extractfile documentation is misleading

Josh Rosenberg report at bugs.python.org
Thu Feb 20 02:12:11 EST 2020


New submission from Josh Rosenberg <shadowranger+python at gmail.com>:

The documentation for extractfile ( https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.extractfile ) says:

"Extract a member from the archive as a file object. member may be a filename or a TarInfo object. If member is a regular file or a link, an io.BufferedReader object is returned. Otherwise, None is returned."

Before reading further, answer for yourself: What do you think happens when a provided filename doesn't exist, based on that documentation?

I teach a Python class whose final project uses tarfile and expects students to catch predictable errors (e.g. a random tarball being provided, rather than one produced by a different mode of the program with specific expected files) and convert them to user-friendly error messages. I've found that this documentation repeatedly confuses students (at least those who actually read it, rather than just guessing and checking interactively).

Specifically, the documentation:

1. Says nothing about what happens if member doesn't exist (TarFile.getmember does mention KeyError, but extractfile doesn't describe itself in terms of getmember)
2. Loosely implies that it should return None in such a scenario: "If member is a regular file or a link, an io.BufferedReader object is returned. Otherwise, None is returned." The intent is likely "all other member types return None, and we're saying nothing about non-existent members", but everyone I've taught who has read the docs came away with a different impression until they tested it.
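
The three cases can be checked with a small in-memory archive. This is only an illustrative sketch: the helper names build_sample_archive and probe, and the member names hello.txt, subdir and missing.txt, are made up for the example and are not part of tarfile.

```python
import io
import tarfile


def build_sample_archive() -> bytes:
    """Build an in-memory tarball holding one regular file and one directory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        data = b"hello\n"
        file_info = tarfile.TarInfo(name="hello.txt")
        file_info.size = len(data)
        tar.addfile(file_info, io.BytesIO(data))

        # A directory entry: an existing member that is not a file or link.
        dir_info = tarfile.TarInfo(name="subdir")
        dir_info.type = tarfile.DIRTYPE
        tar.addfile(dir_info)
    return buf.getvalue()


def probe(archive: bytes, name: str) -> str:
    """Report what extractfile does for the given member name."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        try:
            result = tar.extractfile(name)
        except KeyError:
            return "KeyError"
        if result is None:
            return "None"
        result.close()
        return "file object"


if __name__ == "__main__":
    archive = build_sample_archive()
    for member in ("hello.txt", "subdir", "missing.txt"):
        print(member, "->", probe(archive, member))
```

Only the name that is absent from the archive raises; the directory, which does exist, is the case that yields None.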

Perhaps just reword from:

"If member is a regular file or a link, an io.BufferedReader object is returned. Otherwise, None is returned."

to:

"If member is a regular file or a link, an io.BufferedReader object is returned. For all other existing members, None is returned. If member does not appear in the archive, KeyError is raised."

Similar adjustments may be needed for extract, and/or both of them could be adjusted to explicitly refer to getmember by stating that filenames are converted to TarInfo objects via getmember.
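
For context, this is roughly the kind of handling the students are expected to write once they know the real behavior. It is a sketch under assumptions, not a proposed API: ArchiveContentError and read_required_member are hypothetical names, and the explicit getmember call simply makes the filename-to-TarInfo lookup visible.

```python
import io
import tarfile


class ArchiveContentError(Exception):
    """Hypothetical error type carrying a user-friendly message."""


def read_required_member(tar: tarfile.TarFile, name: str) -> bytes:
    """Return the contents of a member, or raise ArchiveContentError."""
    try:
        # getmember is documented to raise KeyError for unknown names.
        info = tar.getmember(name)
    except KeyError:
        raise ArchiveContentError(
            f"{name!r} is missing from the archive") from None

    fileobj = tar.extractfile(info)
    if fileobj is None:
        # The member exists but is not a regular file or a link.
        raise ArchiveContentError(
            f"{name!r} is not a regular file or link")
    with fileobj:
        return fileobj.read()
```

Both failure modes have to be handled separately, which is exactly what the current wording does not make obvious.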

----------
assignee: docs at python
components: Documentation, Library (Lib)
keywords: easy, newcomer friendly
messages: 362298
nosy: docs at python, josh.r
priority: normal
severity: normal
status: open
title: tarfile's extractfile documentation is misleading
versions: Python 3.7, Python 3.8, Python 3.9

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue39693>
_______________________________________

