[issue22208] tarfile can't add in memory files (reopened)

Mark Grandi report at bugs.python.org
Thu Aug 21 00:05:27 CEST 2014


Mark Grandi added the comment:

> I don't have an idea how to make it easier and still meet all/most requirements and without cluttering up the api.

That is what I mentioned in my original post: a lot of the time users just _don't care_ about most of the stuff that a tar archive can store (permission bits, uid/gid, etc).

Say I'm on my Mac. I can select a bunch of files and then right click -> compress. Pretending that it saves the resulting archive as a .tar.gz rather than a .zip, that's really it. The user doesn't care about the permission bits, uid/gid or any of that, they just want a compressed archive.

The API does do a good job of exposing the low-level parts through TarInfo: you can set all the fields manually or have them figured out by gettarinfo() calling os.stat().
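For reference, here is a minimal sketch of that existing route with a real file on disk. The file names and the "nobody" value are only illustrative; gettarinfo() and addfile() are the existing tarfile calls.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # A real file on disk, so gettarinfo() has something to stat
    path = os.path.join(tmp, "somefile.txt")
    with open(path, "wb") as f:
        f.write(b"hello world!")

    archive = os.path.join(tmp, "sample.tar")
    with tarfile.open(archive, mode="w") as tar:
        # gettarinfo() stats the file and fills in size, mode, mtime, uid, gid
        info = tar.gettarinfo(path, arcname="somefile.txt")
        # Any field can still be overridden by hand before adding
        info.uname = "nobody"
        with open(path, "rb") as f:
            tar.addfile(info, f)

    # Read the archive back to check what was stored
    with tarfile.open(archive) as tar:
        names = tar.getnames()
        stored_size = tar.getmember("somefile.txt").size
```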

My original reasoning for this bug report is that it's way too hard to do this for in-memory files, as those don't have file descriptors, so os.stat() fails. But why can't we just make it so:

gettarinfo() is called
    * if it's a regular file, it continues as it does now
    * if it is NOT a regular file (no file descriptor), then it returns a TarInfo object with the 'name' and 'size' set, and the rest of the fields set to default values (the current user's uid and gid, acceptable permission bits and the correct type for a regular file (REGTYPE?))
        * if gettarinfo() is called with a non-regular file and its name has a slash, then it's assumed to be a folder structure, so it will add the correct TarInfo with type = DIRTYPE and then insert the file underneath that folder, sort of how zipfile works. I looked at the tarfile.py code and it seems it does this already.
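A rough sketch of what that fallback could look like, written as a standalone helper rather than a change to gettarinfo() itself. The name default_tarinfo and the chosen defaults (mode 0o644, current time as mtime) are assumptions for illustration, not anything that exists in tarfile today. It does not handle the directory case.

```python
import io
import os
import tarfile
import time


def default_tarinfo(name, fileobj):
    """Hypothetical helper: TarInfo with defaults for a file object that has no descriptor."""
    info = tarfile.TarInfo(name=name)
    # Measure the bytes from the current position to the end, then restore the position
    pos = fileobj.tell()
    fileobj.seek(0, os.SEEK_END)
    info.size = fileobj.tell() - pos
    fileobj.seek(pos)
    info.type = tarfile.REGTYPE   # plain regular file
    info.mode = 0o644             # assumed "acceptable" permission bits
    info.mtime = int(time.time())
    # os.getuid()/os.getgid() are POSIX only; fall back to 0 elsewhere
    info.uid = os.getuid() if hasattr(os, "getuid") else 0
    info.gid = os.getgid() if hasattr(os, "getgid") else 0
    return info


data = io.BytesIO(b"hello world!")
buf = io.BytesIO()  # write the archive to memory as well
info = default_tarinfo("somefile.txt", data)
with tarfile.open(fileobj=buf, mode="w") as tar:
    tar.addfile(info, data)
```

Taking the size from the current position to the end matches how addfile() reads tarinfo.size bytes starting from wherever the file object currently is.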


This just adds the needed "easy use case" for the tarfile module. The complicated low-level API is already there; we just need something for users who want to create an archive without worrying too much about the low-level stuff. So then they can just:

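import io
import tarfile

fileToAdd = io.BytesIO("hello world!".encode("utf-8"))
with tarfile.open("sample.tar", mode="w") as tar:
    # proposed behavior (this call fails today for a BytesIO);
    # the returned TarInfo object would have:
    #    name = 'somefile.txt'
    #    type = REGTYPE (or whatever is 'just a regular file')
    #    uid = 501, gid = 20, gname=staff, uname=markgrandi, mode=644
    tmpTarInfo = tar.gettarinfo(arcname="somefile.txt", fileobj=fileToAdd)
    # addfile() needs both the TarInfo and the file object to read from
    tar.addfile(tmpTarInfo, fileToAdd)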



So basically it's just having defaults for the TarInfo object when gettarinfo() is given a file that doesn't have a file descriptor.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue22208>
_______________________________________

