Parse each line by character location

Benjamin Kaplan benjamin.kaplan at case.edu
Tue Nov 4 14:05:55 EST 2008


On Tue, Nov 4, 2008 at 11:45 AM, Tyler <hayes.tyler at gmail.com> wrote:

> Hello All:
>
> I hope this is the right place to ask, but I am trying to come up with
> a way to parse each line of a file. Unfortunately, the file is neither
> comma, nor tab, nor space delimited. Rather, the character locations
> imply what field it is.
>
> For example:
>
> The first ten characters would be the record number, the next
> character is the client type, the next ten characters are a volume,
> and the next three are order type, and the last character would be an
> optional type depending on the order type.
>
> The lines are somewhat more complicated, but they work like that, and
> not all have to be populated, in that they may contain spaces. For
> example, the order number may be 2345, and it is space padded at the
> beginning of the line, and others might be zero padded at the front.
> Imagine I have a line:
>
> ______2345H0000300000_NC_
>
> where the underscores indicate a space. I then want to map this to:
>
> 2345,H,0000300000,NC,
>
> In other words, I want to preserve ALL of the fields, but map to
> something that awk could easily cut up afterwards, or open in a CSV
> editor. I am unsure how to place the commas based on character
> location.
>
> Any ideas?


Everyone seems to be using replace(' ',''). However, that removes every
space in the line, including spaces inside a field. You should probably
call strip or lstrip on each field instead. lstrip removes only leading
spaces and leaves trailing ones; strip removes both.
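
To make the difference concrete, here is a small sketch. The field value
is made up for illustration (it is not from the original file); it just
needs an internal space plus padding on both sides.

```python
# Hypothetical field value: padded on both sides, with a space inside.
field = "  NEW YORK  "

no_spaces = field.replace(" ", "")  # removes every space, including the internal one
stripped = field.strip()            # removes leading and trailing spaces only
left_only = field.lstrip()          # removes leading spaces only

print(repr(no_spaces))   # 'NEWYORK'
print(repr(stripped))    # 'NEW YORK'
print(repr(left_only))   # 'NEW YORK  '
```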


>>> line = '______2345H0000300000_NC_'.replace('_',' ')
>>> print line
      2345H0000300000 NC
>>> print line[0:10]
      2345
>>> print line[0:10].lstrip()
2345
>>> print line[10].lstrip()
H
>>> ",".join((line[:10].lstrip(), line[10].lstrip()))
'2345,H'
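
The same idea can be extended to the whole line. The following is only a
sketch: the widths (10, 1, 10, 3, 1) are read off the example in the
question, and the names FIELD_WIDTHS, split_fixed_width and to_csv_line
are made up here. It uses strip rather than lstrip so that the padded
order type and the empty optional field come out as in the desired
output.

```python
# Assumed layout, taken from the example in the question:
# record number (10), client type (1), volume (10), order type (3), optional type (1).
FIELD_WIDTHS = (10, 1, 10, 3, 1)


def split_fixed_width(line, widths=FIELD_WIDTHS):
    """Slice line into fields by character position and strip padding spaces."""
    fields = []
    start = 0
    for width in widths:
        # Slicing past the end of a short line gives '', so missing fields stay empty.
        fields.append(line[start:start + width].strip())
        start += width
    return fields


def to_csv_line(line, widths=FIELD_WIDTHS):
    """Join the stripped fields with commas, keeping empty fields in place."""
    return ",".join(split_fixed_width(line.rstrip("\n"), widths))


sample = "______2345H0000300000_NC_".replace("_", " ")
print(to_csv_line(sample))  # 2345,H,0000300000,NC,
```

Zero padding is left alone because strip only touches whitespace. If a
field could itself contain a comma, writing the list from
split_fixed_width through the csv module would be the safer choice.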





