convert script awk in python

Avi Gross avigross at verizon.net
Fri Mar 26 21:06:19 EDT 2021


Michael,

A generator that opens one file at a time (or STDIN) in a consistent manner
would be a reasonable thing to have as part of emulating AWK.
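
A minimal sketch of such a generator follows. The name read_lines and the
convention that "-" means standard input are assumptions made for this
illustration, not something taken from an existing module.

```python
import sys


def read_lines(paths):
    """Yield lines (without the trailing newline) from each file in turn.

    With no paths, or for a path of "-", read from standard input instead.
    """
    if not paths:
        paths = ["-"]
    for path in paths:
        if path == "-":
            for line in sys.stdin:
                yield line.rstrip("\n")
        else:
            # Only one file is open at any given time.
            with open(path) as handle:
                for line in handle:
                    yield line.rstrip("\n")
```

A script would typically call it as read_lines(sys.argv[1:]) and loop over
the result.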

As I see it, you may want a bit more: the generator could also know how to
parse each line it reads into fields. In Python these probably would not be
names like $1 and $2 but an array of strings, perhaps with the complete line
in array[0] followed by each of the parts.
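
One way this could look, as a sketch only: split_record is a hypothetical
helper name, and splitting on runs of whitespace is assumed as the default
because that is what AWK does when FS is left alone.

```python
def split_record(line, sep=None):
    """Return [whole_line, field1, field2, ...] for one input line.

    sep=None splits on runs of whitespace and ignores leading and
    trailing blanks.
    """
    return [line] + line.split(sep)


fields = split_record("alice 42 admin")
# fields[0] is the whole line, fields[1] plays the role of $1, and so on.
# len(fields) - 1 corresponds to AWK's NF.
```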

Clearly you would place whatever is equivalent to the BEGIN statements in
your code above the call to the generator, then have something like a for
loop assigning each result of the generator to a variable, with your multiple
condition/action parts inside the loop. The END part then goes after the loop.
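
The layout just described might be sketched like this. The function name and
the sample data are invented for illustration, and the input is a plain list
so the example runs on its own; any iterable of lines would do.

```python
def sum_second_column(lines):
    # BEGIN: set up state before any input is read.
    total = 0
    matched = 0

    for line in lines:
        fields = [line] + line.split()

        # pattern { action }: skip comment lines, much like AWK's "next".
        if line.startswith("#"):
            continue

        # pattern { action }: only records with at least two fields.
        if len(fields) > 2:
            total += int(fields[2])
            matched += 1

    # END: runs once, after the loop has consumed all input.
    return total, matched


print(sum_second_column(["# header", "a 1", "b 2", "lonely"]))  # prints (3, 2)
```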

But emulating what AWK does is far from that simple, for example deciding
whether you stop matching patterns once the first match is found and
executed. As I noted, some AWK features do not line up with normal
python such as assuming variables not initialized are zero or "" depending
on context. There may well be scoping issues and other things to consider.
And clearly you need to do things by hand if you want a character string to
be treated as an integer, ...
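
As a rough sketch of doing those two things by hand: collections.defaultdict
can stand in for variables that start out as zero or "", and a small helper
can do the string-to-number conversion. The helper to_number is hypothetical
and only approximates AWK, which would also accept a leading numeric prefix
such as "12abc".

```python
from collections import defaultdict


def to_number(text):
    """Convert a string to a number, treating anything non-numeric as 0."""
    try:
        return int(text)
    except ValueError:
        try:
            return float(text)
        except ValueError:
            return 0


counts = defaultdict(int)   # missing keys start out as 0
names = defaultdict(str)    # missing keys start out as ""

for line in ["apple 3", "pear x", "apple 4"]:
    key, value = line.split()
    counts[key] += to_number(value)
```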

But it is all fairly doable, although I am not sure that translating an AWK
script into python is trivial, or even a good idea.

You could do a similar concept with other utilities like sed or grep or
other such filter utilities where the same generator, or a variant, might
automate things. I am pretty sure some module or other has done things like
this.

It is common in a language like PERL to do something like this:

while(<>)
{
  # get rid of the pesky newline character
  chomp;

  # read the fields in the current record into an array
  @fields = split(':', $_);

  # DO stuff
}

The <> diamond operator is a sort of generator that reads in a line at a
time from as many files as needed and sticks it in $_ by default. You then
throw away the newline, split the line, and do what you wish after that. No
reason python cannot have something similar, maybe more wordy.
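
For comparison, here is a sketch of how the Perl loop above might be written
with the standard library fileinput module. The function name records and
its parameters are made up for this example.

```python
import fileinput


def records(files=None, sep=":"):
    """Yield a list of fields for each input line."""
    # With files=None, fileinput.input() reads the files named in
    # sys.argv[1:], or standard input if there are none.
    with fileinput.input(files=files) as stream:
        for line in stream:
            line = line.rstrip("\n")   # chomp
            yield line.split(sep)      # split(':', $_)
```

The body of the Perl loop then becomes "for fields in records(): ...".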

Disclaimer: I am not suggesting people use AWK or PERL or anything else. The
focus is on people who come from other programming environments and are
looking at how to do common tasks in python.


-----Original Message-----
From: Python-list <python-list-bounces+avigross=verizon.net at python.org> On
Behalf Of Michael Torrie
Sent: Friday, March 26, 2021 8:32 PM
To: python-list at python.org
Subject: Re: convert script awk in python

On 3/25/21 1:14 AM, Loris Bennett wrote:
> Does any one have a better approach?

Not as such.  Running a command and parsing its output is a relatively
common task. Years ago I wrote my own simple python wrapper function that
would make it easier to run a program with arguments, and capture its
output.  I ended up using that wrapper many times, which saved a lot of
time.
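
The wrapper itself is not shown in this message. A minimal sketch of what
such a helper could look like, using subprocess.run from the standard
library, is given here; the name run_command and its behaviour are
assumptions for illustration only.

```python
import subprocess
import sys


def run_command(*args):
    """Run a program with arguments and return its output as a list of lines.

    Raises subprocess.CalledProcessError if the program exits with a
    non-zero status.
    """
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


# Example: use the current Python interpreter as the external program.
lines = run_command(sys.executable, "-c", "print('a b'); print('c d')")
```

Each returned line can then be split and filtered in Python rather than
being piped through sed or awk.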

When it comes to converting a bash pipeline process to Python, it's worth
considering that most pipelines seem to involve parsing using sed or awk
(as yours do), which is way easier to do from python without that kind of
pipelining. However, there is a fantastic article I read years ago about how
generators are python's equivalent to a pipe.
Anyone wanting to replace a bash script with python should read this:

https://www.dabeaz.com/generators/Generators.pdf
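
To give a flavour of the idea, here is a small sketch of generators chained
like pipeline stages. It is not taken from that article; the function names
and the sample data are invented for this example.

```python
def grep(pattern, lines):
    """Yield only the lines containing the substring pattern."""
    return (line for line in lines if pattern in line)


def column(index, lines):
    """Yield the whitespace-separated field at position index (0-based)."""
    for line in lines:
        fields = line.split()
        if index < len(fields):
            yield fields[index]


log = ["GET /index.html 200", "GET /missing 404", "POST /form 200"]

# Roughly: cat log | grep 200 | awk '{print $2}'
paths = list(column(1, grep("200", log)))
```

Each stage pulls items from the previous one only as needed, so nothing is
read ahead of time, much as with a shell pipe.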

Also there's an interesting shell scripting language based on Python called
xonsh which makes it much easier to interact with processes like bash does,
but still leveraging Python to process the output.
https://xon.sh/ .
--
https://mail.python.org/mailman/listinfo/python-list