Python 3 is killing Python

Chris Angelico rosuav at gmail.com
Wed Jul 16 08:55:54 EDT 2014


On Wed, Jul 16, 2014 at 10:10 PM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> Linux, like all Unixes, is primarily a text-based platform. With a few
> exceptions, /etc is filled with text files, not binary files, and half
> the executables on the system are text (Python, Perl, bash, sh, awk,
> etc.).

An interesting assertion. I know "half" is not meant to be an actual
estimate, but out of curiosity, I whipped up a quick script to figure
out just how many of my executables are text and how many aren't.

#!/usr/bin/env python3
import os, subprocess
text = binary = unknown = unreadable = 0
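for path in os.environ["PATH"].split(os.pathsep):
    if not os.path.isdir(path):
        # Skip empty or nonexistent PATH entries; listdir would raise.
        continue
    for file in os.listdir(path):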
        fn = os.path.join(path, file)
        try:
            t = subprocess.check_output(["file", "-L", fn])
        except subprocess.CalledProcessError:
            print("Unreadable: %s" % fn)
            unreadable += 1
            continue
        # 'file' echoes the file name, which may not be pure ASCII.
        if isinstance(t, bytes): t = t.decode("utf-8", "replace")
        # Now to try to figure out what's text and what's binary.
        if "text" in t:
            # Most Unixes follow the convention of having "text" in
            # the output of all files that can be safely blatted to
            # a terminal - for instance, "ASCII text executable" is
            # used to describe most shell scripts etc; this file is
            # a "Python script, ASCII text executable". If I put in
            # a non-ASCII char, the 'file' description changes to
            # "Python script, UTF-8 Unicode text executable".
            text += 1
        elif "directory" in t:
            # Ignore directories.
            pass
        elif "LSB executable" in t or "LSB shared object" in t:
            binary += 1
        else:
            print(t.strip())
            unknown += 1
print("%d text, %d binary" % (text, binary))
if unknown: print("Also %d unknowns, which are probably binary." % unknown)
if unreadable: print("Plus %d files that couldn't be read." % unreadable)


On my system, it says:
rosuav at sikorsky:~$ python3 exectypes.py
/usr/local/bin/youtube-dl: data
Unreadable: /usr/bin/wine-safe
/usr/bin/mptopdf: LaTeX auxiliary file,
/usr/bin/gvfs-less: Palm OS dynamic library data "#!/bin/sh"
Unreadable: /usr/bin/gserialver
1140 text, 2060 binary
Also 3 unknowns, which are probably binary.
Plus 2 files that couldn't be read.

So a bit more than a third of my executables are text. That's a pretty
high proportion, and not very far off the rough guesstimate of half.
(And I tried this on three other Linuxes I have around the house,
getting broadly the same proportion, although the numbers are quite
different.)
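
For anyone wanting to check the "bit more than a third" arithmetic, here is
a minimal sketch using the counts from the run above. The helper name
text_share is made up for illustration, and counting the unknowns as
binary is an assumption (the script itself only says "probably").

```python
# Counts copied from the output above.
text, binary, unknown = 1140, 2060, 3

def text_share(text, binary, unknown=0):
    """Return the fraction of classified executables that are text.

    Unknowns are lumped in with the binaries (an assumption).
    """
    total = text + binary + unknown
    return text / total

print("%.1f%% text" % (100 * text_share(text, binary, unknown)))
```

Leaving the three unknowns out changes the result only in the second
decimal place, so the conclusion does not depend on how they are counted.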

ChrisA


