Help optimize a script?
Joseph Santaniello
someone at _no-spam_arbitrary.org
Wed Oct 17 13:51:51 EDT 2001
Hello All,
I have a simple script that I wrote to convert some fixed-width delimited
files to tab-delimited.
It works, but some of my files are over 100MB and it takes forever.
First, does anyone know of a tool that does this so I don't have to
reinvent the wheel, and barring that, can anyone offer some tips on how to
optimize this code:
import sys
import string

indecies = {'cob': [3, 6, 2, 2, 8, 1, 8], 'opend': [6, 3, 3, 2, 3, 4, 12, 29]}
# above is trimmed for this example
# the lists in this dictionary are the widths of the fields in the
# input files. The keys match the input file names just to keep
# things readable.

# while is used because "for line in readlines()" used too much RAM
# with huge files.
while 1:
    line = sys.stdin.readline()
    if not line:
        break
    new = ''
    start = 0
    for index in indecies[sys.argv[1]]:
        new = new + string.strip(line[start:start + index]) + '\t'
        start = start + index
    print new
So it reads in a line, iterates over the width list in the corresponding
dictionary entry, and prints the stripped substrings extracted according
to the field widths in the list, printing a tab between each, then grabs
a new line and does it again.
Any suggestions on how to speed this up?
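One direction worth considering (a sketch, not part of the original post): the slice
boundaries never change between lines, so they can be precomputed once outside the
loop, and each output line can be built with str.join instead of repeated `new = new + ...`
concatenation, which copies the growing string on every field. The widths below are
illustrative, not the real 'cob'/'opend' values.

```python
import sys

# Hypothetical widths for illustration; the real script would look
# these up in its `indecies` dictionary keyed by the input file name.
widths = [3, 6, 2, 2, 8]

# Precompute (start, end) slice pairs once, outside the per-line loop.
bounds = []
start = 0
for w in widths:
    bounds.append((start, start + w))
    start += w

def convert(line):
    # One join builds the whole output line in a single pass.
    return '\t'.join(line[a:b].strip() for a, b in bounds)

# Iterating over sys.stdin directly reads lines lazily (buffered),
# without loading the whole file the way readlines() does:
#
#   for line in sys.stdin:
#       print(convert(line.rstrip('\n')))
```

The same idea works in the original Python 2 script by replacing the inner
for-loop with a join over the precomputed pairs.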
Thanks,
Joseph
More information about the Python-list mailing list