[Tutor] newbie question about grep in Python

Karl Pflästerer sigurd at 12move.de
Tue Feb 10 19:25:10 EST 2004


On 10 Feb 2004, Eric Culpepper <- eculpepper at hcc-care.com wrote:


> I'm trying my best to learn Python and I'm going through some shell scripts
> attempting to write Python versions of these scripts. I stumbled on this piece
> of a script and I'm struggling to figure out a method of doing this in Python.

Keep in mind that a shell script is sometimes the best tool for such jobs,
so Python won't show all its beauty here.

> #!/bin/sh
> o_date=`strings /shc1/$1/$2 2> /dev/null | grep Date | egrep "\#V\#|\@\(\#\)" \
>         | sed 's/^\\$//' | awk 'BEGIN {FS="$"} {printf("%s\n",substr($2,6))}'`
> echo "o_date is $o_date"

That has nothing to do with Python, but the script is unnecessarily
complicated: just pipe the output from strings directly to awk.
Something like

awk -F '$' '/^#V#\$Date:.*\$#V#/ {
  print substr(gensub(/[a-zA-Z]+/, "", "g", $2), 2)
}'

might do the job (gensub is a GNU awk extension; it might be possible to
write it shorter if I knew better which kinds of patterns may arise, but
that's not the point here).  Write it as a shell function and you're fine.


Back to Python:

[...]
> cAModule  $Header:rxrmon.s, 59, 1/8/04 3:38:55 PM, Roxanne Curtiss$
> c SCCS Library $Workfile:rxrmon.s$
> Version $Revision:59$
> Date $Date:1/8/04 3:38:55 PM$
> #V#$Revision:59$#V#
> #V#$Date:1/8/04 3:38:55 PM$#V#
> - RXRMON - 2jan87 - RM

If the output always looks like that (beginning with #V# and ending with
it) you could write:

********************************************************************
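import os, sys, re

fil = sys.argv[1]   # name of the file to parse
# raw string, so the backslash reaches the regexp engine untouched
reg = re.compile(r'#V#\$Date:(.*)#V#')
strings = os.popen('strings ' + fil)   # read the output of strings(1)

for line in strings:
    m = reg.search(line)
    if m:
        # strip letters and '$' from the captured group
        print(re.sub('[a-zA-Z$]', '', m.groups()[0]))
        break

strings.close()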
********************************************************************

The code above opens a pipe to the output of strings (you call the script
with one argument: the name of the file to parse).  It iterates over the
lines and tries to find one that matches the given regexp (which looks a
bit like the one you used for egrep).  If a line matches, we get a match
object; its groups() method returns a tuple with the matching groups of
the regexp (here there is only one group, the part in parentheses).  The
group still contains some characters we don't want (the letters of "PM"
and the trailing "$"); they are removed and the resulting string is
printed.
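
To see what the match object and groups() hold without needing a binary
file and the strings command, here is a minimal sketch that runs the same
regexp over a few of the sample lines quoted above.  The names SAMPLE and
extract_date are made up for this illustration.

```python
import re

# Same pattern as in the script above, written as a raw string.
reg = re.compile(r'#V#\$Date:(.*)#V#')

# A few lines copied from the quoted sample output.
SAMPLE = [
    'Date $Date:1/8/04 3:38:55 PM$',
    '#V#$Revision:59$#V#',
    '#V#$Date:1/8/04 3:38:55 PM$#V#',
]

def extract_date(lines):
    """Return the cleaned-up date from the first matching line, or None."""
    for line in lines:
        m = reg.search(line)
        if m:
            raw = m.groups()[0]          # the text captured by (.*)
            # remove letters and '$', as the script above does
            return re.sub('[a-zA-Z$]', '', raw)
    return None

print(repr(extract_date(SAMPLE)))
```

Printing with repr() makes it visible that a trailing space is left behind
once "PM" has been removed.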

If a more specific regexp is possible, the need to clean up the group
afterwards may go away.
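
As a rough sketch of what such a stricter regexp could look like: the
pattern below assumes the date always has the form shown in the sample
(month/day/year, then hours:minutes:seconds, then something like " PM").
That is only a guess from one example line, and the names strict and
strict_date are hypothetical.

```python
import re

# Assumed format: "#V#$Date:1/8/04 3:38:55 PM$#V#".
# Only the digits, slashes and colons are captured; whatever follows
# (for example " PM") is skipped up to the closing '$'.
strict = re.compile(r'#V#\$Date:(\d+/\d+/\d+ \d+:\d+:\d+)[^$]*\$#V#')

def strict_date(line):
    """Return the captured date/time string, or None if the line does not match."""
    m = strict.search(line)
    if m:
        return m.group(1)
    return None

print(strict_date('#V#$Date:1/8/04 3:38:55 PM$#V#'))
```

With this pattern the captured group needs no further re.sub, but a line
with a different date layout would simply not match.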

But again: that's not the primary target of Python.  Hacking on the
command line is sometimes better with awk, sed and friends.



   Karl
-- 
Please do *not* send copies of replies to me.
I read the list



