Breaking String into Values

Steven Majewski sdm7g at Virginia.EDU
Fri Mar 29 19:09:27 EST 2002


On Fri, 29 Mar 2002, robert garretson wright wrote:

> I am working on reading in a data file format which is set up as a series
> of lines that look like this:
>
> 3500035000010104A Foo 45
>
> I want to break it up into variables as follows:
> a = 35000, b = 35000, c = 10104, d = 'A', e = 'Foo', f = 45
>
> My current code (auto-generated from a data dictionary) looks
> something like this:
>
> temp = line[0:5]
> a = int(temp)
> temp = line[5:10]
> b = int(temp)
> temp = line[10:15]
> c = int(temp)
> temp = line[15:16]
> d = temp
> temp = line[16:20]
> temp = temp.rstrip()
> e = temp
> temp = line[20:23]
> f = int(temp)
>
> with a bit more error checking around the int() calls.
>
> Is there a better way to do this? The files have around 1000-8000 lines each
> so I would like it to be fast. Is there a package around that someone has
> coded up as a C-extension to do this?
>

"better" ?

I don't know if it's better, but you can do it more concisely with
something like this (s is one line of the file):

>>> a,b,c,d,e,f = [ conv(s[i:j]) for i,j,conv in ((0,5,int),
... 	(5,10,int),(10,15,int),(15,16,str),(16,20,str),(20,23,int))]


( in 2.2 -- earlier versions can do the same in a loop or a map() )

... probably a bit slower, but more readable I think.
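
For anyone who wants to try the table-driven idea as a complete script, here is a minimal sketch. The names FIELDS and parse_line, and the sample line, are made up for illustration; the offsets are the ones from the quoted code. It also uses str.rstrip for field e, which is my assumption about what is wanted, since the original code strips that field.

```python
# Field layout: (name, start, end, converter). Offsets follow the slices
# in the quoted code; the names and the sample line are illustrative only.
FIELDS = (
    ('a', 0, 5, int),
    ('b', 5, 10, int),
    ('c', 10, 15, int),
    ('d', 15, 16, str),
    ('e', 16, 20, str.rstrip),   # assumption: trailing blanks are not wanted
    ('f', 20, 23, int),
)

def parse_line(line):
    """Return the converted field values of one fixed-width line, in order."""
    return [conv(line[i:j]) for _name, i, j, conv in FIELDS]

# Hypothetical sample line built to match the offsets above.
sample = '35000' + '35000' + '10104' + 'A' + 'Foo ' + ' 45'
a, b, c, d, e, f = parse_line(sample)
print(a, b, c, d, e, f)
```

Keeping the layout in one table means a data dictionary only has to generate the FIELDS tuple, not a block of slicing code per field.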


Depending on what you're going to do with them, sometimes it makes
sense to load them as attributes of a class. In that case you can
add the variable names to the list and do a setattr():


class Thing:
	pass
athing = Thing()

for name,i,j,f in (('a',0,5,int),('b',5,10,int),('c',10,15,int),
			('d',15,16,str),('e',16,20,str),('f',20,23,int)):
	setattr(athing,name,f(s[i:j]))



Sometimes, with data file conversion problems, I will make an intermediate
class to represent the abstract object, with methods to load from or dump
to the particular representation format.
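
A rough sketch of what such an intermediate class might look like for this file format. Everything here (the Record class, its load and dump methods, the per-field format strings) is hypothetical and only assumes the field offsets from the quoted code; it is not taken from any existing package.

```python
class Record:
    """Hypothetical intermediate object for one fixed-width line."""

    # (name, start, end, loader, dump format); layout assumed from the
    # slice offsets in the quoted code.
    FIELDS = (
        ('a', 0, 5, int, '%5d'),
        ('b', 5, 10, int, '%5d'),
        ('c', 10, 15, int, '%5d'),
        ('d', 15, 16, str, '%-1s'),
        ('e', 16, 20, str.rstrip, '%-4s'),
        ('f', 20, 23, int, '%3d'),
    )

    def load(self, line):
        """Set one attribute per field from a fixed-width line."""
        for name, i, j, conv, _fmt in self.FIELDS:
            setattr(self, name, conv(line[i:j]))
        return self

    def dump(self):
        """Format the attributes back into the fixed-width layout."""
        return ''.join([fmt % getattr(self, name)
                        for name, _i, _j, _conv, fmt in self.FIELDS])

# Hypothetical sample line built to match the offsets above.
sample = '35000' + '35000' + '10104' + 'A' + 'Foo ' + ' 45'
rec = Record().load(sample)
print(rec.a, rec.e, rec.f)
print(rec.dump() == sample)
```

Note that dump() pads numbers with spaces, so a line whose numeric fields were zero-padded would not come back byte for byte; a format like '%05d' would be needed for that.
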


Again -- it's more overhead to do it that way, but I've usually found
it worth it in the long run.

-- Steve Majewski





