[New-bugs-announce] [issue35483] tarfile.extractall on existing symlink in Ubuntu overwrites target file, not symlink, unlike GNU tar

Michael Brandl report at bugs.python.org
Thu Dec 13 10:22:39 EST 2018


New submission from Michael Brandl <michael.brandl at aid-driving.eu>:

In Ubuntu 16.04, with Python 3.5, as well as custom-built 3.6 and 3.7.1:

Given a file foo.txt (with content "foo") and a symlink myLink to it, packed in one tar (foo.tar), and a file bar.txt (with content "bar") with a symlink myLink to it, packed in another tar (bar.tar),
unpacking the two tars into the same folder (first foo.tar, then bar.tar) leads to the following behavior:

In GNU tar, the directory will contain:
foo.txt (content "foo")
bar.txt (content "bar")
myLink -> bar.txt

Using Python's tarfile, however, calling tarfile.extractall on the two tars gives:
foo.txt (content "bar")
bar.txt (content "bar")
myLink -> foo.txt


Repro:
1. Unpack the attached symLinkBugRepro.tar.gz into a new folder.
2. Run: bash repoSymlink.bash (does exactly what is described above).
3. If the last two lines of the output are "bar" and "bar" (instead of "foo" and "bar"), then the content of foo.txt has been overwritten.
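
The same scenario can also be sketched in pure Python, without the attached script. This is a minimal sketch, assuming a POSIX system where os.symlink works; the helper names make_tar and reproduce are made up for illustration and are not part of the attachment. The filter argument is only passed when the running Python provides extraction filters.

```python
import os
import tarfile
import tempfile


def make_tar(workdir, name):
    """Create <name>.tar holding <name>.txt and a symlink myLink -> <name>.txt."""
    src = os.path.join(workdir, "src_" + name)
    os.mkdir(src)
    with open(os.path.join(src, name + ".txt"), "w") as f:
        f.write(name)
    os.symlink(name + ".txt", os.path.join(src, "myLink"))
    tar_path = os.path.join(workdir, name + ".tar")
    with tarfile.open(tar_path, "w") as tar:
        tar.add(os.path.join(src, name + ".txt"), arcname=name + ".txt")
        tar.add(os.path.join(src, "myLink"), arcname="myLink")
    return tar_path


def reproduce():
    """Extract foo.tar, then bar.tar, into one folder and report the result."""
    with tempfile.TemporaryDirectory() as workdir:
        dest = os.path.join(workdir, "out")
        os.mkdir(dest)
        # Only pass filter= on Python versions that support extraction filters.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        for name in ("foo", "bar"):
            with tarfile.open(make_tar(workdir, name)) as tar:
                tar.extractall(dest, **kwargs)
        result = {}
        for name in ("foo.txt", "bar.txt"):
            with open(os.path.join(dest, name)) as f:
                result[name] = f.read()
        result["myLink"] = os.readlink(os.path.join(dest, "myLink"))
        return result


if __name__ == "__main__":
    print(reproduce())
```

On an affected version, foo.txt would be expected to contain "bar" and myLink would still point to foo.txt, as described above; the outcome depends on the Python version in use.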

Note that this is related to issues like
https://bugs.python.org/issue23228
https://bugs.python.org/issue1167128
https://bugs.python.org/issue19974
https://bugs.python.org/issue10761

None of these addresses the problem described here, however.

The problem lies in line 2201 of https://github.com/python/cpython/blob/master/Lib/tarfile.py:
The code there assumes that any exception means the OS does not support symlinks. Here, however, the exception is raised because the symlink already exists, which should be caught separately. The correct behavior is then NOT to extract the member, but rather to overwrite the symlink (as GNU tar does).
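
Until this is handled inside tarfile, a caller-side workaround along these lines may help. It is only a sketch, not a proposed patch: the function name extractall_replacing_symlinks is hypothetical, it assumes a POSIX system and a trusted archive (member names are joined to the destination without any sanitizing), and it only covers the case where a symlink member meets an existing symlink.

```python
import os
import tarfile


def extractall_replacing_symlinks(tar_path, dest):
    """Extract tar_path into dest, replacing existing symlinks instead of following them."""
    # Only pass filter= on Python versions that support extraction filters.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            target = os.path.join(dest, member.name)
            # If a symlink member would land on an existing symlink, remove
            # the old link first so that os.symlink() cannot fail and the
            # fallback cannot write through the old link into its target.
            if member.issym() and os.path.islink(target):
                os.unlink(target)
            tar.extract(member, dest, **kwargs)
```

Calling this for foo.tar and then bar.tar on the same folder should leave foo.txt untouched and myLink pointing to bar.txt, which is the GNU tar result described above.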

----------
components: Library (Lib)
files: symLinkBugRepro.tar.gz
messages: 331762
nosy: michael.brandl at aid-driving.eu
priority: normal
severity: normal
status: open
title: tarfile.extractall on existing symlink in Ubuntu overwrites target file, not symlink, unlike GNU tar
type: behavior
versions: Python 3.5, Python 3.6, Python 3.7
Added file: https://bugs.python.org/file47992/symLinkBugRepro.tar.gz

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue35483>
_______________________________________

