3.2 can't extract tarfile produced by 2.7

Ian Kelly ian.g.kelly at gmail.com
Thu Dec 27 14:25:13 EST 2012


On Thu, Dec 27, 2012 at 11:50 AM, Steven W. Orr <steveo at syslang.net> wrote:
> Really? I thought that the whole idea of using "rb" or "wb" was something
> that was necessitated by WinBlo$e. We're not doing IO on a text file here.
> It's a tar file which by definition is binary and it's not clear to me why
> unicode has anything to do with it. The files you extract should be
> unaffected and the archive you produce shouldn't care. Am I missing
> something?

Python 3 uses the 'b' mode to signify that a binary stream should be
opened instead of a text stream.  A binary stream returns bytes when
read from.  A text stream returns strings when read from, which means
that the bytes must be decoded; it also performs optional newline
conversion.  For full details, see the io module documentation.
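
To make the difference concrete, here is a minimal sketch that reads the
same file both ways. The scratch file and the "ascii" encoding are
assumptions picked for the illustration; they are not taken from the
OP's code.

```python
import io
import os
import tempfile

# Create a scratch file containing Windows-style line endings.
# (The temporary path is made up for this sketch.)
fd, path = tempfile.mkstemp()
os.close(fd)
with io.open(path, "wb") as f:
    f.write(b"abc\r\ndef\r\n")

# Binary stream: read() returns bytes, exactly as stored on disk.
with io.open(path, "rb") as f:
    raw = f.read()

# Text stream: read() returns a decoded string, and with the default
# newline handling "\r\n" is translated to "\n".
with io.open(path, "r", encoding="ascii") as f:
    text = f.read()

os.remove(path)

print(repr(raw))   # the bytes still contain \r\n
print(repr(text))  # the string contains only \n
```

The text stream has both decoded the data and rewritten the line
endings, and neither of those is something you want done to the
contents of an archive.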

You're correct that it makes no sense to open a tar file in text
mode, but the basic io.open constructor has no concept of file type
and relies on the caller to specify the mode properly.  The tarfile
module has its own tarfile.open function which has no "text mode";
this is generally the correct way to open a tar file.  For some reason
the OP is not using this but is instead opening the file with io.open
(in the wrong mode) and then passing the already-opened file object to
tarfile.open.
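
As a rough sketch of the two approaches that should work (the file names
here are hypothetical, since I don't know the OP's actual paths): either
let tarfile.open open the archive by name, or, if a file object really
must be passed in, open it in binary mode first.

```python
import io
import os
import shutil
import tarfile
import tempfile

# Hypothetical scratch directory and file names for the illustration.
workdir = tempfile.mkdtemp()
member = os.path.join(workdir, "hello.txt")
archive = os.path.join(workdir, "example.tar")

with io.open(member, "wb") as f:
    f.write(b"hello\n")

# Usual approach: tarfile.open opens the archive itself, so there is
# no text/binary mode to get wrong.
with tarfile.open(archive, "w") as tar:
    tar.add(member, arcname="hello.txt")

with tarfile.open(archive, "r") as tar:
    names = tar.getnames()

# If an already-opened file object has to be passed, it needs to be a
# binary stream ("rb"), not a text stream.
with io.open(archive, "rb") as fobj:
    with tarfile.open(fileobj=fobj, mode="r") as tar:
        data = tar.extractfile("hello.txt").read()

shutil.rmtree(workdir)

print(names)
print(repr(data))
```

Passing a file object opened with "r" instead of "rb" hands tarfile
strings where it expects bytes, which is the kind of failure the OP is
seeing under 3.2.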


