parsing question

Tim Chase python.list at
Mon May 31 11:07:02 EDT 2010

On 05/31/2010 08:42 AM, Mag Gam wrote:
> I have a file with bunch of nfsstat -c (on AIX) which has all the
> hostnames, for example
> Is there an easy way to parse this file according to each host?
> So,
> r1svr.Connectionless.calls=6553
> r1svr.Connectionless.badcalls=0
> and so on...
> I am currently using awk which I am able to get what I need, but
> curious if in python how people handle block data.

Since you already profess to having an awk solution, I felt it 
was okay to at least take a stab at my implementation (rather 
than doing your job for you :).  Without a complete spec for the 
output, it's a bit of guesswork, but I got something fairly close 
to what you want.  It uses nested dictionaries, which means the 
keys and values have to be referenced like

   servers[server][client][subtype][key]

and the values are strings (I'm not sure what you want in the 
case of the data that has both a value and percentage), not 
numbers.

That said, this should get you fairly close to what you describe:


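import re
header_finding_re = re.compile(r'\b\w{2,}')
version_re = re.compile(r'^Version (\d+):\s*\(.*\)$', re.I)
# These two prefixes are assumptions about the nfsstat layout
# (e.g. "Client rpc:", "Connectionless"); adjust to match your file.
CLIENT_HEADER = 'Client '
CONNECTION_HEADER = 'Connection'
servers = {}
server = client = orig_client = subtype = None
source = open('data.txt')
for line in source:
   line = line.rstrip('\r\n')
   if not line.strip(): continue
   if line.startswith('='*5) and line.endswith('='*5):
     # a "=====hostname=====" banner starts a new server block
     server = line.strip('=')
     client = orig_client = subtype = None
   elif line.startswith(CLIENT_HEADER):
     # drop the prefix and the trailing colon
     orig_client = client = line[len(CLIENT_HEADER):-1]
     subtype = 'all'
   elif line.startswith(CONNECTION_HEADER):
     subtype = line.replace(' ', '').lower()
   else: # it's a version or header row
     m = version_re.match(line)
     if m:
       subtype = "v" + m.group(1)
     else:
       if None in (server, client, subtype):
         print("Missing data %r" % ((server, client, subtype),))
         continue
       dest = servers.setdefault(server, {}
         ).setdefault(client, {}
         ).setdefault(subtype, {})
       # the values sit on the line right after the header row
       data = next(source, '').rstrip('\r\n')
       row = header_finding_re.finditer(line)
       prev = next(row)
       for header in row:
         # a value spans from its header's column to the next header's
         key = prev.group(0)
         value = data[prev.start():header.start()].strip()
         prev = header
         dest[key] = value
       # the last column runs to the end of the line
       key = prev.group(0)
       value = data[prev.start():].strip()
       dest[key] = value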

for server, clients in servers.items():
   for client, subtypes in clients.items():
     for subtype, kv in subtypes.items():
       for key, value in kv.items():
         print("%s = %s" % (".".join([server, client, subtype, key]), value))
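
If you want to poke at the column-slicing trick on its own, here's a 
minimal sketch of it pulled out into a function.  The sample rows are 
made up for illustration (only "calls" and "badcalls" come from your 
example), and split_by_headers is just a name I picked, so treat it as 
an assumption about the layout rather than a spec:

```python
import re

# Same pattern as in the script: a header is any run of 2+ word characters.
header_finding_re = re.compile(r'\b\w{2,}')

def split_by_headers(header_line, data_line):
    """Slice data_line at the columns where each header starts."""
    headers = list(header_finding_re.finditer(header_line))
    result = {}
    for i, h in enumerate(headers):
        # A value runs from its own header's column up to the next
        # header's column, or to the end of the line for the last one.
        end = headers[i + 1].start() if i + 1 < len(headers) else None
        result[h.group(0)] = data_line[h.start():end].strip()
    return result

# Made-up rows; assumes values are left-aligned under their headers.
header_line = "calls".ljust(11) + "badcalls".ljust(11) + "retrans"
data_line = "6553".ljust(11) + "0".ljust(11) + "0"
print(split_by_headers(header_line, data_line))
```

This only works when each value starts at or after its header's column 
and ends before the next header's, which is what the fixed-width 
output appears to do.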

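Since everything ends up stored as strings, you may want to convert 
the values afterwards.  Here's a rough sketch, assuming the 
value-plus-percentage fields look like "12 3%" (a count, a space, then 
a percentage); that format is my guess, and parse_value is a 
hypothetical helper name:

```python
def parse_value(text):
    """Convert '6553' to 6553 and '12 3%' to (12, 3.0).

    Anything that doesn't fit either shape is returned unchanged.
    """
    parts = text.split()
    try:
        if len(parts) == 1:
            return int(parts[0])
        if len(parts) == 2 and parts[1].endswith('%'):
            # assumed layout: count followed by a percentage
            return int(parts[0]), float(parts[1][:-1])
    except ValueError:
        pass
    return text

# Hypothetical nested dict in the same shape the script builds.
servers = {'r1svr': {'rpc': {'connectionless': {'calls': '6553',
                                                'badcalls': '0'}}}}
calls = parse_value(servers['r1svr']['rpc']['connectionless']['calls'])
print(calls + 1)
```

Returning the original string on failure keeps odd fields visible 
instead of silently dropping them.
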

Have fun,

