Parse each line by character location

Giorgio Gentili george.gentili at alice.it
Tue Nov 4 12:14:55 EST 2008


Tyler wrote:
> Hello All:
> 
> I hope this is the right place to ask, but I am trying to come up with
> a way to parse each line of a file. Unfortunately, the file is neither
> comma, nor tab, nor space delimited. Rather, the character locations
> imply what field it is.
> 
> For example:
> 
> The first ten characters would be the record number, the next
> character is the client type, the next ten characters are a volume,
> and the next three are order type, and the last character would be an
> optional type depending on the order type.
> 
> The lines are somewhat more complicated, but they work like that, and
> not all have to be populated, in that they may contain spaces. For
> example, the order number may be 2345, and it is space padded at the
> beginning of the line, and others might be zero padded in the front.
> Imagine I have a line:
> 
> ______2345H0000300000_NC_
> 
> where the underscores indicate a space. I then want to map this to:
> 
> 2345,H,0000300000,NC,
> 
> In other words, I want to preserve ALL of the fields, but map to
> something that awk could easily cut up afterwards, or open in a CSV
> editor. I am unsure how to place the commas based on character
> location.
> 
> Any ideas?

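One solution is to slice each field out by position, join the fields with
commas, and then remove the padding spaces. The order type occupies
characters 21 to 23 and the optional type is character 24, so each gets its
own slice:

line_new = (line[0:10] + ',' + line[10] + ',' + line[11:21] + ',' +
            line[21:24] + ',' + line[24:25]).replace(' ', '')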
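
If the real records have more fields than this example, the same slicing idea can be driven by a table of offsets instead of one long expression. The following is only a sketch: the names FIELDS and to_csv are made up for illustration, and the offsets are assumed from the layout described in the question (10 + 1 + 10 + 3 + 1 characters), so adjust them to the actual file.

```python
# Hypothetical layout: (start, end) offsets assumed from the question's
# description: record number, client type, volume, order type, optional type.
FIELDS = [(0, 10), (10, 11), (11, 21), (21, 24), (24, 25)]

def to_csv(line):
    """Slice a fixed-width record into fields and join them with commas."""
    line = line.rstrip("\n")
    # strip() removes only the padding around each field, not spaces inside it.
    # Slices past the end of a short line simply yield empty fields.
    return ",".join(line[start:end].strip() for start, end in FIELDS)

print(to_csv("      2345H0000300000 NC "))
```

For the sample line this prints 2345,H,0000300000,NC, with the empty optional type kept as a trailing field. Stripping each field separately, rather than calling replace(' ', '') on the whole line, leaves any spaces inside a field untouched.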



