Am I not seeing the Error?

Roy Smith roy at panix.com
Wed Aug 14 08:39:25 EDT 2013


In article <mailman.573.1376482061.1251.python-list at python.org>,
 Chris Angelico <rosuav at gmail.com> wrote:

> On Wed, Aug 14, 2013 at 7:59 AM, Joshua Landau <joshua at landau.ws> wrote:
> > On 14 August 2013 02:20, Gregory Ewing <greg.ewing at canterbury.ac.nz> wrote:
> >> Ned Batchelder wrote:
> >>>
> >>> Everyone: this program seems to be a direct and misguided transliteration
> >>> from a bash script.
> >>
> >> Not a particularly well-written bash script, either --
> >> it's full of superfluous uses of 'cat'.
> >
> > What's wrong with cat? Sure it's superfluous but what makes it *bad*?
> > Personally I often prefer the pipe "cat x | y" form to "x < y"... or
> > "< y x".
> 
> What's the use of it, in that situation? Why not simply use
> redirection? (Though you have the letters backward; "cat y | x" would
> be the equivalent of your others. Typo, I assume.) You're forking a
> process that achieves nothing, if your cat has just one argument.

This is waaaaayyyy off-topic for a Python discussion, but...

There are two reasons UUOC ("useless use of cat") is a silly issue.
First, it may save human effort.  I like to build up long, complicated
commands and pipelines one bit at a time, and look at the intermediate
results.  Let's say I'm starting with a sed command (abstracted from my
current shell history):

$ sed -e 's/.*; iOS/iOS/' -e 's/;.*//' -e 's/\..*//' x

When I want to add the next "-e whatever" to the command, I need to get 
it in front of the "x".  If I had written it as:

$ cat x | sed -e 's/.*; iOS/iOS/' -e 's/;.*//' -e 's/\..*//'

I just have to stick it at the end, which is easier; I just type 
control-p and add what I want.  Or, "!!" and keep typing.  A small 
amount of human convenience (especially when it's mine) is worth a lot 
of wasted CPU time.
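
Since this is a Python list: the spellings under discussion (file
argument, redirection, and cat-into-a-pipe) can be compared from Python
with subprocess.  This is only a rough sketch.  It assumes Python 3.7+
and POSIX cat and sed on the PATH.  The sample input lines and the helper
names are made up for illustration; they are not the real contents of "x".

```python
import os
import subprocess
import tempfile

# The three sed expressions from the command line above.
SED_ARGS = ["-e", r"s/.*; iOS/iOS/", "-e", r"s/;.*//", "-e", r"s/\..*//"]


def sed_with_file_arg(path):
    """Equivalent of: sed ... x  (sed opens the file itself)."""
    result = subprocess.run(["sed", *SED_ARGS, path],
                            capture_output=True, text=True, check=True)
    return result.stdout


def sed_with_redirect(path):
    """Equivalent of: sed ... < x  (the file becomes sed's stdin)."""
    with open(path) as f:
        result = subprocess.run(["sed", *SED_ARGS], stdin=f,
                                capture_output=True, text=True, check=True)
    return result.stdout


def sed_with_cat(path):
    """Equivalent of: cat x | sed ...  (an extra process feeds a pipe)."""
    cat = subprocess.Popen(["cat", path], stdout=subprocess.PIPE)
    sed = subprocess.Popen(["sed", *SED_ARGS], stdin=cat.stdout,
                           stdout=subprocess.PIPE, text=True)
    cat.stdout.close()  # let cat receive SIGPIPE if sed exits early
    out, _ = sed.communicate()
    cat.wait()
    return out


def make_sample():
    """Write a small, hypothetical input file and return its path."""
    fd, path = tempfile.mkstemp(text=True)
    with os.fdopen(fd, "w") as f:
        f.write("Mozilla/5.0 (iPhone; iOS 6.1.3; Scale/2.00)\n")
        f.write("Mozilla/5.0 (iPad; iOS 5.0.1; Scale/1.00)\n")
    return path


if __name__ == "__main__":
    sample = make_sample()
    try:
        for run in (sed_with_file_arg, sed_with_redirect, sed_with_cat):
            print(run.__name__, repr(run(sample)))
    finally:
        os.remove(sample)
```

All three functions should return the same text.  The only difference is
how many processes get started and who opens the file, which is the whole
of the UUOC argument.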

Second, in some cases, the extra "cat" process may actually speed up 
overall command execution because it provides additional I/O buffering.  
The cat process will read ahead from the disk file and block only when 
its output pipe buffers are full.  When the sed command is ready to 
process more input, it only has to read from the pipe, not wait for a 
(very slow, by comparison) disk read.  Yeah, I know, modern kernels do 
lots of read-ahead buffering on their own.  This gives you more.

Sure, it costs something to fork/exec another process.  So what?  The 
computer exists to do my bidding, not the other way around.
