[Python-ideas] pathlib suggestions

Franklin? Lee leewangzhong+python at gmail.com
Wed Jan 25 15:40:05 EST 2017


> A ".tar.gz" is not the same as a ".svg.gz".  The fact that they are both
> gzip-compressed is an implementation detail as far as most software I deal
> with is concerned.  My unarchiver will extract a ".tar.gz" into a directory
> as if it was just a ".tar", while my image viewer will view a ".svg.gz" as a
> vector image as if it was just a ".svg".  From a user-interaction
> standpoint, the ".gz" part is ignored.

Just to be sure we're on the same page:
- A .tar file is an uncompressed bundle of files.
- A .gz file is a compressed version of a single file.
- Technically, there's no such thing as a .tar.gz file. "x.tar.gz"
means that if you unwrap it with gunzip, you'll get a file called
"x.tar", which you can then unpack with tar.

"x.tar.gz" is not a tar file that uses gzip compression. It's a gz
file which decompresses to a tar file. Conceptually, your unarchiver
handles it in two separate steps: gunzip first, then untar.
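
To make the two steps concrete, here is a minimal sketch using only the
standard library. The helper names `make_tar_gz` and `unpack_two_steps`
are made up for illustration. It builds a plain tar in memory, gzips it
as a separate step, then reverses both steps. Opening with mode "r:"
tells tarfile to accept only an uncompressed tar, so the gunzip really
has to happen first.

```python
import gzip
import io
import tarfile


def make_tar_gz(files):
    """Build an uncompressed tar in memory, then gzip it as a second step.

    `files` maps member names to bytes. (Hypothetical helper.)
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:  # plain tar, no compression
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return gzip.compress(buf.getvalue())  # the "x.tar" bytes wrapped in gz


def unpack_two_steps(blob):
    """Undo the wrapping in two explicit steps. (Hypothetical helper.)"""
    tar_bytes = gzip.decompress(blob)  # step 1: gunzip gives the "x.tar" bytes
    # step 2: read the tar; "r:" means "uncompressed tar only"
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        return {m.name: tar.extractfile(m).read()
                for m in tar.getmembers() if m.isfile()}


blob = make_tar_gz({"a.txt": b"hello"})
print(unpack_two_steps(blob))
```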

Similarly, "x.svg.gz" is a gz file which decompresses to an svg file.
Your viewer just knows to gunzip it before use.

I don't want to come across as a naysayer, so here's an alternative
suggestion: a parameter that takes a collection of "extension
suffixes". The function would eat extensions from the end until it
finds one NOT on the list (or it runs out). The docs can recommend
`('gz', 'xz', 'bz', 'bz2', ...)`. Maybe a later Python version can use
that recommendation as the default.
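
A rough sketch of that idea, not a proposed API: the function name
`split_ext`, the parameter name `wrappers`, and the default tuple are
all placeholders. I'm also assuming that the first suffix NOT on the
list is the "real" extension and gets returned along with the wrapper
suffixes that follow it.

```python
from pathlib import PurePath

# Placeholder default, taken from the list recommended above.
WRAPPER_SUFFIXES = ('gz', 'xz', 'bz', 'bz2')


def split_ext(path, wrappers=WRAPPER_SUFFIXES):
    """Split a file name into (base name, extension).

    Suffixes are consumed from the end. Consumption stops after the
    first suffix that is not in `wrappers`, or when none are left.
    """
    p = PurePath(path)
    taken = []
    for suffix in reversed(p.suffixes):  # e.g. ['.tar', '.gz'] -> '.gz', '.tar'
        taken.append(suffix)
        if suffix[1:] not in wrappers:   # strip the leading dot before lookup
            break                        # found the "real" extension
    ext = ''.join(reversed(taken))
    base = p.name[:len(p.name) - len(ext)]
    return base, ext


print(split_ext("x.tar.gz"))     # ('x', '.tar.gz')
print(split_ext("x.svg.gz"))     # ('x', '.svg.gz')
print(split_ext("x.part1.rar"))  # ('x.part1', '.rar')
```

With this reading, "rar" is not on the list, so "x.part1.rar" stops
right away and ".part1" stays in the base name.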

IMO, ".part1" is not part of the extension. You'd usually have
"x.part1.rar" and "x.part2.rar" in the same folder, and it makes more
sense to treat them as two files with base names "x.part1" and
"x.part2" than as two files with the same base name and extensions
that merely keep them ordered.

