Parse each line by character location

Tim Chase python.list at tim.thechases.com
Tue Nov 4 14:36:29 EST 2008


>>   recno_idx = slice(0,10)
>>   client_idx = slice(10, 11)
>>   volume_idx = slice(11,11+10)
>>   order_type_idx = slice(11+10, 11+10+3)
> 			.
> !?  That seems to me confusingly far from a working solution,
> at least in comparison to
> 
>     recno_idex = the_line[0:10]
>     client_idx = the_line[10:11]
> 	...
> 
> What am I missing?

The "11+10" and "11+10+3" are there to show where the magic 
numbers come from: each is a column offset from the previous 
position.  To have been consistent, I suppose I should have used

   client_idx = slice(10, 10+1)

Somewhat like a kludgy version of George Sakkis's more elegant 
version of slicing, but with the advantage of associating names 
with the slice-boundaries.
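
As a rough sketch of that named-slice idea, the boundaries can be derived from field widths so that each offset follows from the previous one. The widths (10, 1, 10, 3) are read off the offsets above; the make_slices helper, the field names as dictionary keys, and the sample line are made up for illustration and are not from the original code.

```python
# Hypothetical helper: build named slices from (name, width) pairs,
# so each boundary is computed from the previous one instead of typed by hand.
def make_slices(fields):
    slices = {}
    start = 0
    for name, width in fields:
        slices[name] = slice(start, start + width)
        start += width
    return slices

# Widths assumed from the offsets in this thread: 10, 1, 10, 3.
FIELDS = [('recno', 10), ('client', 1), ('volume', 10), ('order_type', 3)]
idx = make_slices(FIELDS)

# Made-up sample line laid out with those widths.
sample = '0000000001F0000012345BUYrest of line'
print(sample[idx['recno']])
print(sample[idx['client']])
print(sample[idx['volume']])
print(sample[idx['order_type']])
```

Changing one width then shifts every later field automatically, which is the same bookkeeping the "11+10+3" spelling does by hand.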

It would be possible to write it as something like

   # "out" is assumed to be an already-open output file
   for line in open('in.txt'):
     out.write(','.join([
       line[0:10],  # recno
       line[10:11], # client
       line[11:21], # volume
       line[21:24], # order
       line[24:],   # remainder (keeps the trailing newline)
       ]))

but then it's harder to verify that the slicing doesn't incur a 
fencepost (off-by-one) error, and harder to follow when later 
manipulations need further checks, like

   if line[client_idx] == 'F': continue # skip this client
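
For comparison, here is a sketch of the whole loop written with the named slices and that check. The convert function, the remainder_idx name, and the in-memory sample data are assumptions added for illustration; a real run would pass open files instead of StringIO objects.

```python
import io

# Named slices; each boundary is an offset from the previous field.
recno_idx = slice(0, 10)
client_idx = slice(10, 10 + 1)
volume_idx = slice(11, 11 + 10)
order_type_idx = slice(11 + 10, 11 + 10 + 3)
remainder_idx = slice(11 + 10 + 3, None)  # assumed name for "the rest"

def convert(lines, out):
    """Write each fixed-width line to out as comma-separated fields."""
    for line in lines:
        if line[client_idx] == 'F':
            continue  # skip this client
        out.write(','.join([
            line[recno_idx],
            line[client_idx],
            line[volume_idx],
            line[order_type_idx],
            line[remainder_idx],  # still carries the trailing newline
        ]))

# Made-up sample input: the first record belongs to client 'F' and is skipped.
src = io.StringIO(
    '0000000001F0000012345BUYfirst\n'
    '0000000002A0000000042SELsecond\n'
)
out = io.StringIO()
convert(src, out)
print(out.getvalue(), end='')
```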

There are a number of ways to slice & dice the line.  I recommend 
whichever is easiest to read/understand.

-tkc






