convert script awk in python

Christian Gollwitzer auriocus at gmx.de
Thu Mar 25 03:42:11 EDT 2021


On 25.03.21 at 00:30, Avi Gross wrote:
> It [awk] is, as noted, a great tool and if you only had one or a few tools like it
> available, it can easily be bent and twisted to do much of what the others
> do as it is more programmable than most. But following that line of
> reasoning, fairly simple python scripts can be written with python -c "..."
> or by pointing to a script

The thing with awk is that lots of useful text processing is built
directly into the main syntax, whereas in Python you can certainly do it
as well, but it requires importing a module. The simple column summation
mentioned before by Cameron would be

    awk ' {sum += $2 } END {print sum}'

which is easily typed on a command line, with the benefit that it
effectively skips every line where the 2nd column is missing or does not
start with a number, because awk counts such a field as 0. This is
important because there are often empty lines in the data, often an empty
line at the end, some ASCII headers, whatever.

The closest equivalent I can come up with in Python is this:

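==============================
import sys

s = 0
for line in sys.stdin:
    try:
        # Add the 2nd whitespace-separated column if it parses as a number.
        s += float(line.split()[1])
    except (IndexError, ValueError):
        # Fewer than two columns, or not a number: skip this line.
        pass
print(s)
==============================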


I don't want to cram this into a python -c " " line, if that is even
possible; how do you handle indentation levels and loops?
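
One way around the indentation problem might be to replace the loop with a
generator expression and the try/except with a regular-expression check.
This is only a minimal sketch: the names NUMBER and column_sum are made up
for illustration, and the pattern accepts plain decimal numbers only, which
is narrower than what float() understands.

```python
import re

# Assumption of this sketch: "looks like a number" means an optional sign,
# digits with an optional decimal point, and an optional exponent.
NUMBER = re.compile(r"[-+]?(\d+\.?\d*|\.\d+)([eE][-+]?\d+)?")


def column_sum(lines, col=1):
    """Sum one whitespace-separated column, ignoring non-numeric fields."""
    fields = map(str.split, lines)
    # A single expression: no loop body and no try/except, so no indentation.
    return sum(
        float(f[col])
        for f in fields
        if len(f) > col and NUMBER.fullmatch(f[col])
    )


# Small demonstration on literal lines instead of sys.stdin.
sample = ["name value\n", "a 1.5\n", "\n", "b 2\n", "c x\n"]
print(column_sum(sample))
```

Since the body of column_sum is one expression, it could in principle be
joined with the imports by semicolons and passed to python -c with
sys.stdin as input, although quoting the pattern in the shell needs care
and the result is hardly as comfortable to type as the awk version.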

Of course, for big fancy programs Python is a much better choice than 
awk, no questions asked - but awk has a place for little things which 
fit the special programming model, and there are surprisingly many 
applications where this is just the easiest and fastest way to do the job.

It's like regexes - a few simple characters can do a job that
otherwise requires a bulky program, but once the parsing reaches a certain
complexity, a true parsing language, or even just hand-coded Python, is
much more maintainable.

	Christian

PS: Exercise - handle lines commented out with a '#', i.e. skip those. 
In awk:

    gawk '!/^\s*#/ {sum += $2 } END {print sum}'
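
For comparison, one possible Python take on the same exercise, as a sketch
only. It wraps the loop from the script above in a function (the name
sum_second_column is made up here) and adds a check for lines whose first
non-blank character is '#', which is what the gawk pattern tests.

```python
def sum_second_column(lines):
    """Sum the 2nd column, skipping '#' comment lines and non-numeric fields."""
    total = 0
    for line in lines:
        # Skip lines whose first non-blank character is '#',
        # mirroring the gawk pattern !/^\s*#/.
        if line.lstrip().startswith("#"):
            continue
        try:
            total += float(line.split()[1])
        except (IndexError, ValueError):
            # Fewer than two columns, or not a number: ignore the line.
            pass
    return total


# The commented-out lines would contribute 100 and 5 without the '#' check.
demo = ["# 100 x\n", "a 1\n", "  # 5 b\n", "c 2.5\n", "\n"]
print(sum_second_column(demo))
```

Called as print(sum_second_column(sys.stdin)) after import sys, it would
read from standard input like the gawk command.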


