trouble building data structure

David Alban extasia at extasia.org
Sun Sep 28 20:04:20 EDT 2014


greetings,

i'm writing a program to scan a data file.  for each line of the data file
i'd like to add an entry like the one below to a dictionary.  my perl
background makes me want python to autovivify, but when i do:

      file_data = {}

      [... as i loop through lines in the file ...]

          file_data[ md5sum ][ inode ] = { 'path' : path, 'size' : size, }

i get:

Traceback (most recent call last):
  File "foo.py", line 45, in <module>
    file_data[ md5sum ][ inode ] = { 'path' : path, 'size' : size, }
KeyError: '91b152ce64af8af91dfe275575a20489'
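
as far as i can tell, the failure comes from the lookup rather than the
assignment: file_data[ md5sum ] is evaluated first, and a plain dict raises
KeyError for a missing key instead of creating an inner dict.  a minimal
sketch that reproduces this (the key, inode and path below are made-up
placeholders, not values from my data file):

```python
# Minimal reproduction; the key and values are hypothetical placeholders.
file_data = {}
md5sum = 'abc123'   # placeholder, not a real checksum
inode = '42'

try:
    # The lookup file_data[md5sum] runs before the inner assignment,
    # and a plain dict does not create missing keys on read.
    file_data[md5sum][inode] = {'path': '/tmp/example', 'size': 1}
    error = None
except KeyError as exc:
    error = exc

# Show the exception that was caught; file_data is still empty here.
print(repr(error))
```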

what is the pythonic way to build a nested "file_data" structure like the
one above?

on http://en.wikipedia.org/wiki/Autovivification there is a section on how
to do autovivification in python, but i want to learn how a python
programmer would normally build a data structure like this.
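
for comparison, here is a sketch of two candidates i have come across:
collections.defaultdict and dict.setdefault, both from the standard library.
i don't know which one is considered more idiomatic.  the sample records are
invented for illustration and stand in for the fields parsed from each line:

```python
from collections import defaultdict

# Hypothetical sample records: (md5sum, inode, path, size).
records = [
    ('aaa', '101', '/data/one.bin', 200),
    ('aaa', '102', '/data/one-copy.bin', 200),
    ('bbb', '201', '/data/two.bin', 300),
]

# Candidate 1: defaultdict(dict) creates the inner dict on first access.
by_default = defaultdict(dict)
for md5sum, inode, path, size in records:
    by_default[md5sum][inode] = {'path': path, 'size': size}

# Candidate 2: a plain dict plus setdefault, which returns the existing
# inner dict or inserts and returns a new empty one.
by_setdefault = {}
for md5sum, inode, path, size in records:
    by_setdefault.setdefault(md5sum, {})[inode] = {'path': path, 'size': size}
```

both loops should end up with the same two-level mapping.  one difference i
am aware of: a defaultdict also creates an entry when a missing key is merely
read, while the setdefault version only creates entries where it is called.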

here is the code so far:

#!/usr/bin/python

import argparse
import os

ASCII_NUL = chr(0)

HOSTNAME = 0
MD5SUM   = 1
FSDEV    = 2
INODE    = 3
NLINKS   = 4
SIZE     = 5
PATH     = 6

file_data = {}

if __name__ == "__main__":
  parser = argparse.ArgumentParser(
    description='scan files in a tree and print a line of information about each regular file')
  parser.add_argument('--file', '-f', required=True,
    help='File from which to read data')
  parser.add_argument('--field-separator', '-s', default=ASCII_NUL,
    help='Specify the string to use as a field separator in output.  The default is the ascii nul character.')
  args = parser.parse_args()

  file = args.file
  field_separator = args.field_separator

  with open( file, 'rb' ) as f:
    for line in f:
      line = line.rstrip('\n')
      if line == 'None': continue
      fields = line.split( field_separator )

      hostname = fields[ HOSTNAME ]
      md5sum   = fields[ MD5SUM ]
      fsdev    = fields[ FSDEV ]
      inode    = fields[ INODE ]
      nlinks   = int( fields[ NLINKS ] )
      size     = int( fields[ SIZE ] )
      path     = fields[ PATH ]

      if size < ( 100 * 1024 * 1024 ): continue

      ### print "'%s' '%s' '%s' '%s' '%s' '%s' '%s'" % ( hostname, md5sum, fsdev, inode, nlinks, size, path, )

      file_data[ md5sum ][ inode ] = { 'path' : path, 'size' : size, }

thanks,
david
-- 
Our decisions are the most important things in our lives.
***
Live in a world of your own, but always welcome visitors.

