Flat file to associative array.

Bengt Richter bokr at oz.net
Tue Jul 22 01:09:18 EDT 2003


On Mon, 21 Jul 2003 01:02:49 -0400, "Alex" <bogus at antispam.com> wrote:

>Hello,
>
>What would be the best way to convert a flat file of products into
>a multi-dimensional array? The flat file looks like this:
>
>Beer|Molson|Dry|4.50
>Beer|Molson|Export|4.50
>Shot|Scotch|Macallan18|18.50
>Shot|Regular|Jameson|3.00
>
>I've managed to read the file properly and get the individual lines
>into a list, but I've got to turn this list into something that can
>be stored in memory like a dictionary / associative array so I can
>properly display the data.
>
The problem is way underspecified, AFAICS.
Imagine a magic black box that loads the data into itself, never mind
how represented. What do you want to ask of the black box?
(I was going to do an OO approach, but just tweaked your code a bit ;-)

What does "properly display" mean? All the data at once? Selected? How selected?
What do you want displays to look like? Why do you say multi-dimensional array?
Is the input a tree/outline, the way it appears to be?
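
For instance, if "multi-dimensional" only means looking a price up by category, brand and name together, a flat dict keyed by tuples might already be enough. A rough sketch in current Python 3 syntax; the names load_flat and SAMPLE are made up for illustration, and it assumes every good line has exactly four fields:

```python
# Illustration only: load_flat and SAMPLE are hypothetical names,
# not part of the original post.
SAMPLE = """\
Beer|Molson|Dry|4.50
Beer|Molson|Export|4.50
Shot|Scotch|Macallan18|18.50
Shot|Regular|Jameson|3.00
"""

def load_flat(lines):
    """Map (category, brand, name) tuples to price strings."""
    table = {}
    for line in lines:
        fields = line.strip().split('|')
        if len(fields) != 4:
            continue  # assumption: anything without four fields is skipped
        category, brand, name, price = fields
        table[(category, brand, name)] = price
    return table

table = load_flat(SAMPLE.splitlines())
print(table[('Shot', 'Scotch', 'Macallan18')])  # 18.50
```

Whether that or a nested tree is the better fit depends on the answers to the questions above.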

>Here's the code I have so far:
>
I suggest upgrading to Python 2.2.2 or better ...
>[Code]
>#! /usr/bin/python
>
>import string
 \_won't need__
>
>file = open("definitions", "r")
 infile = file('definitions')  # file is new name for creating file objects, 'r' is default
>
 ___________________    
/
>while 1:
>    line = file.readline()
>    if not line:
>        break
>    else:
\____________________
 # replace above by
 products = {} # root directory
 for line in infile:
>        data = string.split(line, '|')
         data = line.split('|')
         ... etc see below
 ___________________________________________
/
>        item = dict([('product', data[0])])
>        print "You want a " + item['product'] +
>         ". What kind?  " + data[1] + " for $" +
>        data[3] + " ... right?"
>
\___________________________________________
???

>file.close()
>[/Code]
>

====< flat2tree.py >=========================================
import StringIO
infile = StringIO.StringIO("""\
Beer|Molson|Dry|4.50
Beer|Molson|Export|4.50
Shot|Scotch|Macallan18|18.50
Shot|Regular|Jameson|3.00
This line should be rejected.
Beer|Virtual|c.l.py|Priceless ;-)
""")
# infile = file('definitions')  # file is new name for creating file objects, 'r' is default

def makeTree(infile):
    products = {} # root directory
    for line in infile:
        data = line.strip().split('|')
        if data == ['']: continue # blank line: ''.split('|') gives [''], never []
        if len(data)<2:
            print 'Rejecting data line: %r'%line
            continue
        currnode = products
        for kind in data[:-2]:
            currnode = currnode.setdefault(kind, {})
        currnode[data[-2]] = data[-1] # assumes unique kind->price at end of lines
    return products
    
# print sorted outline ?
def pso(node, indent=0):
    if not isinstance(node, dict):
        print '%s%s'% ('  '*indent, node)
    else:
        nodenames = node.keys()
        nodenames.sort()
        for name in nodenames:
            print '%s%s'% ('  '*indent, name)
            pso(node[name], indent+1)

if __name__ == '__main__':
    import sys
    print
    if sys.argv[1:]: infile = file(sys.argv[1])
    products = makeTree(infile)
    infile.close()
    print '\nSorted product outline:'
    pso(products)
    print '\n---- the product dict -----\n', products
=============================================================
Maybe this will give you a start. Running it produces this result:

[22:09] C:\pywk\clp>flat2tree.py

Rejecting data line: 'This line should be rejected.\n'

Sorted product outline:
Beer
  Molson
    Dry
      4.50
    Export
      4.50
  Virtual
    c.l.py
      Priceless ;-)
Shot
  Regular
    Jameson
      3.00
  Scotch
    Macallan18
      18.50

---- the product dict -----
{'Beer': {'Virtual': {'c.l.py': 'Priceless ;-)'}, 'Molson': {'Dry': '4.50', 'Export': '4.50'}},
 'Shot': {'Regular': {'Jameson': '3.00'}, 'Scotch': {'Macallan18': '18.50'}}}

[22:09] C:\pywk\clp>
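
Once the data is in nested dicts like this, asking the "black box" a question mostly comes down to walking the tree one key at a time. A minimal sketch in Python 3 syntax; lookup is a hypothetical helper, not part of flat2tree.py, and the small products dict is hand-copied from the output above:

```python
# Hypothetical helper, not part of flat2tree.py.
def lookup(tree, *path):
    """Follow path through nested dicts; return None if any step is missing."""
    node = tree
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node

# Hand-copied subset of the product dict printed above.
products = {
    'Beer': {'Molson': {'Dry': '4.50', 'Export': '4.50'}},
    'Shot': {'Scotch': {'Macallan18': '18.50'}, 'Regular': {'Jameson': '3.00'}},
}

print(lookup(products, 'Beer', 'Molson', 'Dry'))  # 4.50
print(sorted(lookup(products, 'Shot')))           # ['Regular', 'Scotch']
print(lookup(products, 'Wine'))                   # None
```

A partial path returns the sub-dict, so the same helper can answer "what kinds of Shot are there?" as well as "what does a Molson Dry cost?".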

Note that the prices aren't converted to actual numbers; they are still in string form.
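
If numeric prices are wanted, one possible approach is to convert the last field with the standard decimal module and decide separately what to do with entries that are not numbers. A sketch in Python 3 syntax; to_price is a hypothetical helper, and returning None for bad input is just one assumed policy:

```python
from decimal import Decimal, InvalidOperation

# Hypothetical helper, not part of flat2tree.py.
def to_price(text):
    """Return a Decimal for a parseable price string, or None otherwise."""
    try:
        return Decimal(text)
    except InvalidOperation:
        return None  # assumed policy: flag non-numeric prices with None

print(to_price('4.50'))                      # 4.50
print(to_price('Priceless ;-)'))             # None
print(to_price('4.50') + to_price('18.50'))  # 23.00
```

Decimal keeps the trailing zero and avoids binary floating-point rounding, which tends to matter for money.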

Regards,
Bengt Richter



