Python 3 is killing Python
Chris Angelico
rosuav at gmail.com
Wed Jul 16 08:55:54 EDT 2014
On Wed, Jul 16, 2014 at 10:10 PM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> Linux, like all Unixes, is primarily a text-based platform. With a few
> exceptions, /etc is filled with text files, not binary files, and half
> the executables on the system are text (Python, Perl, bash, sh, awk,
> etc.).
An interesting assertion. I know "half" is not meant to be an actual
estimate, but out of curiosity, I whipped up a quick script to figure
out just how many of my executables are text and how many aren't.
#!/usr/bin/env python3
import os, subprocess
text = binary = unknown = unreadable = 0
for path in os.environ["PATH"].split(":"):
    for file in os.listdir(path):
        fn = os.path.join(path, file)
        try:
            t = subprocess.check_output(["file", "-L", fn])
        except subprocess.CalledProcessError:
            print("Unreadable: %s" % fn)
            unreadable += 1
            continue
        if isinstance(t, bytes): t = t.decode("ascii")
        # Now to try to figure out what's text and what's binary.
        if "text" in t:
            # Most Unixes follow the convention of having "text" in
            # the output of all files that can be safely blatted to
            # a terminal - for instance, "ASCII text executable" is
            # used to describe most shell scripts etc; this file is
            # a "Python script, ASCII text executable". If I put in
            # a non-ASCII char, the 'file' description changes to
            # "Python script, UTF-8 Unicode text executable".
            text += 1
        elif "directory" in t:
            # Ignore directories.
            pass
        elif "LSB executable" in t or "LSB shared object" in t:
            binary += 1
        else:
            print(t.strip())
            unknown += 1
print("%d text, %d binary" % (text, binary))
if unknown: print("Also %d unknowns, which are probably binary." % unknown)
if unreadable: print("Plus %d files that couldn't be read." % unreadable)
On my system, it says:
rosuav@sikorsky:~$ python3 exectypes.py
/usr/local/bin/youtube-dl: data
Unreadable: /usr/bin/wine-safe
/usr/bin/mptopdf: LaTeX auxiliary file,
/usr/bin/gvfs-less: Palm OS dynamic library data "#!/bin/sh"
Unreadable: /usr/bin/gserialver
1140 text, 2060 binary
Also 3 unknowns, which are probably binary.
Plus 2 files that couldn't be read.
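One of the unknowns in that output, /usr/bin/gvfs-less, has "#!/bin/sh" quoted in its description, which suggests 'file' can occasionally misidentify a script. A minimal sketch of a cross-check, assuming that a file whose first two bytes are "#!" should count as text; the helper name has_shebang is hypothetical and not part of the script that produced the output:

```python
import sys

def has_shebang(path):
    """Return True if the file at path begins with the two bytes '#!'."""
    try:
        with open(path, "rb") as f:
            return f.read(2) == b"#!"
    except OSError:
        # Unreadable or missing files are simply reported as "no shebang".
        return False

if __name__ == "__main__":
    # Usage: python3 shebang.py /usr/bin/gvfs-less /usr/bin/mptopdf ...
    for fn in sys.argv[1:]:
        print("%s: %s" % (fn, "shebang" if has_shebang(fn) else "no shebang"))
```

This only looks at the first two bytes, so it says nothing about whether the rest of the file is safe to print to a terminal; it is a second opinion on the unknowns, not a replacement for 'file'.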
So a bit more than a third of my executables are text. That's a pretty
high proportion, and not very far off the rough guesstimate of half.
(And I tried this on three other Linuxes I have around the house,
getting broadly the same proportion, although the numbers are quite
different.)
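For anyone who wants to check the "bit more than a third" arithmetic, here is a quick sketch using the counts printed above. It assumes the unreadable files are left out entirely, and shows the share both with and without the three unknowns counted as binary:

```python
# Counts copied from the output above.
text, binary, unknown = 1140, 2060, 3

# Share of text executables among the files that were classified.
share_known = text / (text + binary)

# Same share if the unknowns are all counted as binary.
share_all = text / (text + binary + unknown)

print("%.1f%% text (classified files only)" % (100 * share_known))
print("%.1f%% text (unknowns counted as binary)" % (100 * share_all))
```

Either way the figure comes out at roughly 35.6%, just above one third.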
ChrisA