[Tutor] Help with strings and lists.

Alan Collins online330983 at telkomsa.net
Fri Jul 14 03:18:04 CEST 2006


Hi,

I do a fair bit of data manipulation and decided to try one of my 
favourite utilities in Python. I'd really appreciate some optimization 
of the script. I'm sure that I've missed many tricks in even this short 
script.

Let's say you have a file with this data:

Monday 7373 3663657 2272 547757699 reached 100%
Tuesday 7726347 552 766463 2253 under-achieved 0%
Wednesday 9899898 8488947 6472 77449 reached 100%
Thursday 636648 553 22344 5699 under-achieved 0%
Friday 997 3647757 78736632 357599 over-achieved 200%

You now want columns 1, 5, and 7 printed and aligned (much like a 
spreadsheet). For example:

Monday    547757699 100%
Wednesday     77449 100%
...

This script does the job, but I reckon there are better ways.  In the 
interests of brevity, I have dropped the command-line argument handling 
and hard-coded both the columns and the input file name for the test.

-------------------------------------------------------
"""
PrintColumns

Print specified columns, alignment based on data type.

The script works by parsing the input file twice.  The first pass gets
the maximum length of all values in each selected column.  This value is
used to pad the column on the second pass.

"""
import sys

columns = [0]     # hard-code the columns to be printed.
colwidth = [0]    # list in which the maximum field lengths will be stored.

"""
This part is clunky.  Can't think of another way to do it without
making the script somewhat longer and slower.  If the user specifies
column 0, all columns will be printed.  This bit builds up the list of
columns, from 1 to 100.
"""

if columns[0] == 0:
     columns = [1]
     while len(columns) < 100:
         columns.append(len(columns)+1)

"""
First pass.  Read all lines and determine the maximum width of each
selected column.
"""
infile = file("mylist", "r")
indata = infile.readlines()
for myline in indata:
     mycolumns = myline.split()
     colindex = 0
     for column in columns:
         if column <= len(mycolumns):
             if len(colwidth)-1 < colindex:
                 colwidth.append(len(mycolumns[column-1]))
             else:
                 if colwidth[colindex] < len(mycolumns[column-1]):
                     colwidth[colindex] = len(mycolumns[column-1])
         colindex += 1
infile.close()

"""
Second pass.  Read all lines and print the selected columns.  Text
values are left-justified, while numeric values are right-justified.
"""
infile = file("mylist", "r")
indata = infile.readlines()
for myline in indata:
     mycolumns = myline.split()
     colindex = 0
     for column in columns:
         if column <= len(mycolumns):
             if mycolumns[column-1].isdigit():
                 x = mycolumns[column-1].rjust(colwidth[colindex]) + ' '
             else:
                 x = mycolumns[column-1].ljust(colwidth[colindex]+1)
             print x,
         colindex += 1
     print ""
infile.close()
-------------------------------------------------------
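
For comparison, the same two-pass idea (measure the widest value per
column, then pad) can be sketched more compactly.  This is only a sketch
under some assumptions, not a drop-in replacement: the function name
format_columns and the convention that columns=None means "all columns"
are made up for illustration, the lines are passed in as a list rather
than read from "mylist" twice, and the output uses a single space between
columns with trailing whitespace stripped, so the spacing is not
identical to what the script above prints.

```python
def format_columns(lines, columns=None):
    """Return the selected 1-based columns of each line, padded to align.

    columns=None is a convention of this sketch meaning "all columns".
    """
    rows = [line.split() for line in lines]

    if columns is None:
        # Use as many columns as the longest row has (0 if no rows).
        ncols = max([len(row) for row in rows] + [0])
        columns = range(1, ncols + 1)
    columns = list(columns)

    # Pass 1: widest value seen in each selected column.
    widths = [0] * len(columns)
    for row in rows:
        for i, col in enumerate(columns):
            if col <= len(row):
                widths[i] = max(widths[i], len(row[col - 1]))

    # Pass 2: right-justify all-digit values, left-justify the rest.
    out = []
    for row in rows:
        cells = []
        for i, col in enumerate(columns):
            if col <= len(row):
                value = row[col - 1]
                if value.isdigit():
                    cells.append(value.rjust(widths[i]))
                else:
                    cells.append(value.ljust(widths[i]))
        out.append(" ".join(cells).rstrip())
    return out


sample = [
    "Monday 7373 3663657 2272 547757699 reached 100%",
    "Tuesday 7726347 552 766463 2253 under-achieved 0%",
    "Wednesday 9899898 8488947 6472 77449 reached 100%",
    "Thursday 636648 553 22344 5699 under-achieved 0%",
    "Friday 997 3647757 78736632 357599 over-achieved 200%",
]
for line in format_columns(sample, [1, 5, 7]):
    print(line)
```

Keeping the split rows in memory means the file only has to be read
once, and returning a list of strings instead of printing makes the
result easy to check or write elsewhere.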

Any help greatly appreciated.
Regards,
Alan.

