preparing data for visualization

Mon Nov 12 18:53:39 EST 2007

On Nov 12, 5:12 pm, John Machin <sjmac... at lexicon.net> wrote:
> Bryan.Fodn... at gmail.com wrote:
> > I would like to have my data in a format so that I can create a
> > contour plot.
>
> > My data is in a file with a format, where there may be multiple fields
>
> > field = 1
>
> > 1a      0
> > 2a      0
>
> The above is NOT consistent with the later listing of your data file.
>
> [big snip]
>
> > 10b     0
>
> > where the value is how far from the center it will be displaced,
>
> >                     a             b                     a
> > b                      a             b
> > 10      0000000000|0000000000   0000000000|0000000000   0000000000|
>
> [big snip of seemingly irrelevant stuff]
>
> > 1        0000000000|0000000000   0000000000|0000000000   0000000000|
> > 0000000000
>
> > I could possibly have many of these that I will add together and
> > normalize to one.
>
> > Also, there are 60 a and b blocks, the middle 40 are 0.5 times the
> > width of the outer 20.
>
> > I thought about filling an array, but there is not a one to one
> > symmetry.
>
> > I cannot seem to get my head around this. Can anybody help me get
> > started?
>
> > I have tried to use a dictionary, but cannot seem to get it to work
> > the way I want.
>
> > I try this,
>
> > ---------------------------------------------------------------------------------
>
> > f = open('TEST1.MLC')
>
> > fields = {}
>
> > for line in f:
> >    if line.split()[0] == 'Field':
>
> >        field = int(line.split()[-1])
>
> Do line.split() ONCE per line.
>
>
>
> >    elif line.split()[0] == 'Leaf':
> >        fields[field] = line.split()[-1]
> >    else:
> >        line = f.next()
>
> Don't mix
>     for line in f
> and
>     line = f.next()
> otherwise you will skip lines that you don't want to skip and/or
> become confused.
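
To make the skipping concrete, here is a minimal sketch. It assumes Python 3 (where f.next() is spelled next(f)); the sample text and the function name collect_mixed are made up for illustration and only mimic the layout of the data file.

```python
import io

def collect_mixed(text):
    """Mimic the original loop: iterate with 'for' AND call next() inside it."""
    f = io.StringIO(text)          # stands in for the open file
    seen = []
    for line in f:
        key = line.split()[0]      # safe here: the sample has no blank lines
        if key in ('Field', 'Leaf'):
            seen.append(line.strip())
        else:
            # The for loop advances on its own; this pulls one MORE line
            # off the same iterator, and that line is never examined.
            line = next(f)
    return seen

# Hypothetical sample input.
sample = (
    "Field = 10\n"
    "Index = 0.0000\n"
    "Leaf  1A =   0.00\n"
    "Leaf  2A =   0.00\n"
    "Leaf  3A =   0.00\n"
)

print(collect_mixed(sample))
```

In this sample the "Index" line lands in the else branch, next() swallows the "Leaf  1A" line, and the loop carries on from "Leaf  2A", so leaf 1A silently disappears.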
>
>
>
> > ---------------------------------------------------------------------------------
>
> > and get,
>
> > ---------------------------------------------------------------------------------
>
> > Traceback (most recent call last):
> >  File "<pyshell#1>", line 1, in <module>
> >    line.split()[0]
> > IndexError: list index out of range
>
> This indicates that you have a list for which 0 is not a valid index.
> If it had 1 or more elements, then 0 would be a valid index. I
> conclude that the list is empty. This would happen if line contained
> no characters other than whitespace.

In other words, the blank lines between the blocks of data.
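
A quick way to convince yourself, as a minimal sketch: the helper name first_token and the sample lines are made up for illustration, and the print call assumes Python 3.

```python
def first_token(line):
    """Return the first whitespace-separated token, or None for a blank line."""
    tokens = line.split()      # split once; a whitespace-only line gives []
    if not tokens:
        return None            # indexing tokens[0] here would raise IndexError
    return tokens[0]

# Hypothetical lines mimicking the data file, including a blank separator.
sample_lines = ["Tolerance = 0.50\n", "\n", "Field = 10\n"]

for line in sample_lines:
    print(repr(line), "->", first_token(line))
```

The blank line comes back as None instead of blowing up, which is all the "if not tokens" test is doing.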

>
> > Here is my data file,
>
> > ---------------------------------------------------------------------------------
>
> > File Rev = G
> > Treatment = Dynamic Dose
> > Last Name = Fodness
> > First Name = Bryan
> > Patient ID = 0001
> > Number of Fields = 4
> > Number of Leaves = 120
> > Tolerance = 0.50
>
> > Field = 10
> > Index = 0.0000
> > Carriage Group = 1
> > Operator =
> > Collimator = 0.0
> > Leaf  1A =   0.00
> > Leaf  2A =   0.00
> [snip]
> > Leaf 20A =   0.00
> > Leaf 21A =   5.00
> > Leaf 22A =   5.00
> [snip]
> > Leaf 40A =   5.00
>
> [big snip -- your code failed no later than the 10th line in the data
> file]
> To find out what is going on, print out some variables:
>
> 8<---  fodness.py ----
> f = open('fodness.dat')
> fields = {}
> for lino, line in enumerate(f):
>     tokens = line.split()
>     print "Line %d: tokens = %r" % (lino, tokens)
>     if not tokens:
>         continue # blank/empty line
>     tok0 = tokens[0]
>     if tok0 == 'Field':
>         field = int(tokens[-1])
>     elif tok0 == 'Leaf':
>         fields[field] = tokens[-1]
>     else:
>         continue
>     print "   Fields:", fields
> 8<---
>
> Results [truncated]:
>
> C:\junk>fodness.py | more
> Line 0: tokens = []
> Line 1: tokens = ['File', 'Rev', '=', 'G']
> [snip]
> Line 8: tokens = ['Tolerance', '=', '0.50']
> Line 9: tokens = []
> Line 10: tokens = ['Field', '=', '10']
>    Fields: {}
> Line 11: tokens = ['Index', '=', '0.0000']
> Line 12: tokens = ['Carriage', 'Group', '=', '1']
> Line 13: tokens = ['Operator', '=']
> Line 14: tokens = ['Collimator', '=', '0.0']
> Line 15: tokens = ['Leaf', '1A', '=', '0.00']
>    Fields: {10: '0.00'} <<<<<<<<<<====== Don't you need a float
> instead of a string??
> Line 16: tokens = ['Leaf', '2A', '=', '0.00']
>    Fields: {10: '0.00'}
> Line 17: tokens = ['Leaf', '3A', '=', '0.00']
>    Fields: {10: '0.00'}
> Line 18: tokens = ['Leaf', '4A', '=', '0.00']
>    Fields: {10: '0.00'}
>

Yep, crashing on blank lines. Test for an empty list before indexing:

f = open(r'C:\python25\user\MLC\TEST1.MLC')
fields = {}
for line in f:
  the_line = line.split()           # split only once
  if the_line:                      # test if the_line empty
    if the_line[0] == 'Field':      # if not, start checking
      field = int(the_line[-1])
    elif the_line[0] == 'Leaf':
      fields[field] = the_line[-1]
    ## f.next() removed

> Don't you want/need to use the leaf IDs (1A, 2A, etc)?? I guess that
> you want to end up with NESTED dictionaries, like this:
> fields = {
>     10: {
>         '1A': 0.0,
>         '2A': 0.0,
>         etc,
>         },
>     8: {
>         etc,
>         },
>     etc,
>     }

The fixed code keeps only one leaf value per field (the last one
read), since every Leaf line overwrites fields[field]:
>>> fields
{8: '0.00', 1: '0.00', 10: '0.00', 4: '0.00'}

Note also that the values are still strings; you will probably
want to convert them with float().
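
Putting both points together (one inner dictionary per field, keyed by leaf ID, with float values), a rough sketch could look like the following. It assumes Python 3 and that the leaf ID is always the second token on a Leaf line; the function name parse_fields and the sample text (in particular the Field 8 value) are made up for illustration.

```python
import io

def parse_fields(lines):
    """Build {field_number: {leaf_id: position}} from the lines of an MLC-style file."""
    fields = {}
    field = None
    for line in lines:
        tokens = line.split()               # split only once
        if not tokens:                      # skip blank lines
            continue
        if tokens[0] == 'Field':
            field = int(tokens[-1])
            fields[field] = {}              # start a new inner dict for this field
        elif tokens[0] == 'Leaf' and field is not None:
            # tokens look like ['Leaf', '1A', '=', '0.00']
            fields[field][tokens[1]] = float(tokens[-1])
    return fields

# Hypothetical sample; the Field 8 value is invented.
sample = """\
Number of Fields = 4

Field = 10
Index = 0.0000
Leaf  1A =   0.00
Leaf 21A =   5.00

Field = 8
Leaf  1A =   1.50
"""

fields = parse_fields(io.StringIO(sample))
print(fields)
```

With the real file you would pass the open file object instead of the StringIO, and fields[10]['21A'] would then give you a float you can do arithmetic on.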

>
> HTH,
> John