[Tutor] unpack/regexp

Tue Apr 11 17:08:44 CEST 2006

Paul Kraus wrote:
> Ok, sorry for the Perl reference, but I can't figure out how to do this.
> I have a fixed-width text file I need to parse.
> 
> So let's say I want an array to contain the pieces I need,
> where the fields I want have these lengths from left to right:
> 10 10 13
> 12345678901234567890123456789012
> I want to turn this into an array that has these elements.
> 1234567890
> 1234567890
> 123456789012 <--notice white space
> 
> In Perl it's a simple
> my @array = unpack( "A10 A10 A13", $line );
> This extracts the fields and removes the trailing whitespace after doing so.

struct.unpack() is a close analog, though unlike Perl's A format it keeps
trailing whitespace:

In [1]: import struct

In [10]: line = "12345678901234567890123456789012 "

In [16]: struct.unpack('10s10s13s', line)
Out[16]: ('1234567890', '1234567890', '123456789012 ')
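To get Perl's whitespace-stripping behaviour as well, one option is to wrap struct.unpack() in a small helper. This is only a sketch: the name unpack_fixed and the default widths are made up for illustration, the data is assumed to be ASCII, and the encode/decode step is there because struct works on bytes under Python 3.

```python
import struct

def unpack_fixed(line, widths=(10, 10, 13)):
    """Split a fixed-width line into fields and strip trailing whitespace.

    Hypothetical helper; assumes ASCII data and a line exactly
    sum(widths) characters long (struct.unpack raises struct.error otherwise).
    """
    # Build a format such as '10s10s13s' from the field widths.
    fmt = "".join("%ds" % w for w in widths)
    raw = struct.unpack(fmt, line.encode("ascii"))
    # rstrip() mimics what Perl's "A" format does to trailing spaces.
    return [field.decode("ascii").rstrip() for field in raw]

print(unpack_fixed("12345678901234567890123456789012 "))
# ['1234567890', '1234567890', '123456789012']
```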

You can also use string slicing:

In [14]: line[:10], line[10:20], line[20:]
Out[14]: ('1234567890', '1234567890', '123456789012 ')
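If there are more than a few fields, the slice boundaries can be computed from the widths instead of being typed by hand. A minimal sketch, assuming the same three widths; slice_fixed is a hypothetical name, not a library function.

```python
from itertools import accumulate

def slice_fixed(line, widths=(10, 10, 13)):
    """Cut a line into fields of the given widths using plain slicing."""
    ends = list(accumulate(widths))   # running totals: [10, 20, 33]
    starts = [0] + ends[:-1]          # matching start offsets: [0, 10, 20]
    # Slicing never raises on a short line; missing fields come back as ''.
    return [line[s:e].rstrip() for s, e in zip(starts, ends)]

print(slice_fixed("12345678901234567890123456789012 "))
# ['1234567890', '1234567890', '123456789012']
```

Unlike struct.unpack(), this tolerates lines that are shorter or longer than the total width, which may or may not be what you want.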

> 
> Or if I wanted I could do
> my @array = ( $1, $2, $3 ) if ( $line =~ m/^(.{10})(.{10})(.{13})/ );

Python's regex syntax is a bit more verbose than Perl's, but you can do the same thing:

In [2]: import re

In [11]: m = re.match(r"(.{10})(.{10})(.{13})", line)

In [13]: m.group(1, 2, 3)
Out[13]: ('1234567890', '1234567890', '123456789012 ')
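Note that re.match() returns None when a line is too short to match, so m.group() would fail on such a line. A sketch of a wrapper that checks for that and also strips trailing whitespace; FIELD_RE and match_fixed are names invented for this example.

```python
import re

# Compile once if the pattern is applied to every line of a file.
FIELD_RE = re.compile(r"(.{10})(.{10})(.{13})")

def match_fixed(line):
    """Return the three stripped fields, or None if the line does not match."""
    m = FIELD_RE.match(line)
    if m is None:
        # Fewer than 33 characters (or an embedded newline) means no match.
        return None
    return [g.rstrip() for g in m.groups()]

print(match_fixed("12345678901234567890123456789012 "))
# ['1234567890', '1234567890', '123456789012']
```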

Kent