#!/usr/bin/python or #!/usr/bin/env python?

Stephan Kuhagen stk at mevis.de
Thu Aug 10 01:20:55 EDT 2006


Erik Max Francis wrote:

> The problem is that there are endless ways to do that, and figuring out
> all the cases makes `file` an sh interpreter, not the magic number
> detector it's supposed to be.

It makes it into a pattern-matcher, not an interpreter. But that is what it
already is.

You are right that there are endless ways to do that, but only one or a
very small subset is common. The way I cited is the way mentioned in the
Tclsh-manpage, so it can be regarded (not must, but can) as the standard
header of a Tcl script on Unix-like systems.

Even if the construct sometimes differs a little bit, "file" should be able
to identify it, since the manpage of "file" says that it is not a pure
magic-number detector. The first part explains how the magic-number
matching works and then what is done when that fails:

"If  a  file  does not match any of the entries in the magic file, it is
examined to see if it seems to be a text file.
[...]
Once file has determined the character set used in a text-type file, it will
attempt to determine in what language the file is written.  The language
tests look for particular strings (cf names.h) that can appear anywhere in
the first few blocks of a file.  For example, the keyword .br indicates
that the file is most likely a troff(1) input file, just as the keyword
struct indicates a C program.  These tests are less reliable than the
previous two groups, so they are performed last."

This is not the most reliable way, as the manpage says, but it should work:
if somewhere in the first few blocks there is a continued comment line
followed by the exec trick, "file" can at least guess that it is a Tcl
file. So this would be a valid method to detect that file's type. After
all, a troff file is a troff file even when there is no .br in the first
few blocks, and "file" tries to identify it anyway. "file" is not meant to
ignore troff files just because they are sometimes hard to detect. The same
applies to Tcl files, I think. Not perfectly reliable, but worth a try,
since it is, according to the Tclsh-manpage, the common header pattern for
a Tcl script.
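
To make the idea concrete, here is a rough sketch in Python (this being
comp.lang.python) of the kind of guess I mean. It is not how "file" is
implemented. The names looks_like_tcl and HEAD_BYTES, the 4 KiB limit
standing in for "the first few blocks", and the regular expression are all
my own assumptions for illustration. It only looks for a comment line that
ends in a backslash, directly followed by a line that execs something with
"tclsh" or "wish" in its name:

```python
import re

# Assumption: "the first few blocks" is approximated by the first 4 KiB.
HEAD_BYTES = 4096

# A comment line ending in a backslash (so Tcl treats the next line as part
# of the comment), directly followed by a line that execs a program whose
# name contains "tclsh" or "wish". This is a heuristic, not a parser.
TCL_EXEC_TRICK = re.compile(
    rb"^[ \t]*#[^\n]*\\\r?\n[ \t]*exec[ \t]+\S*(?:tclsh|wish)",
    re.MULTILINE,
)


def looks_like_tcl(head: bytes) -> bool:
    """Guess whether the start of a file uses the tclsh exec trick."""
    return TCL_EXEC_TRICK.search(head[:HEAD_BYTES]) is not None


def guess_file(path: str) -> bool:
    """Read the head of the file at `path` and apply the heuristic."""
    with open(path, "rb") as f:
        return looks_like_tcl(f.read(HEAD_BYTES))


if __name__ == "__main__":
    sample = (
        b"#!/bin/sh\n"
        b"# the next line restarts using tclsh \\\n"
        b'exec tclsh "$0" "$@"\n'
    )
    print(looks_like_tcl(sample))
```

Like the .br test for troff, this will miss unusual variants of the header
and is only a guess, but it covers the common case.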

So "file" cannot be perfect and reliable in every case, but it should try
to make a good guess. If you do not care about the Tcl headers (and why
should you, this is comp.lang.python... ;-), you are right with your
reasoning. But if you accept that "file" cannot be perfect anyway and want
it to be as good as possible, then it is some kind of bug or missing
feature in "file" that it recognizes (or tries to) some morphing file
formats but not this one (which is fairly widespread, even if Tcl is not a
modern buzzword language these days).

Stephan


