Stripping non-numbers from a file parse without nested lists?

Rhodri James rhodri at wildebst.demon.co.uk
Tue Mar 31 21:47:04 EDT 2009


On Tue, 31 Mar 2009 06:51:33 +0100, <daku9999 at gmail.com> wrote:

> There has got to be a better way of doing this:
>
> I'm reading in a file that has a lot of garbage, but eventually has
> something that looks similar to:
> (some lines of garbage)
> dip/dir.
> (some more lines of garbage)
> 55/158
> (some more lines of garbage)
> 33/156
> etc.
>
> and I'm stripping out the 55/158 values (with error checking
> removed):
> ------
> def read_data(filename):
>        fh = open(filename, "r", encoding="ascii")
>
>        for line in fh:
>            for word in line.lower().split():
>                if "/" in word and "dip" not in word:
>                    temp = word.partition("/")
>                    dip.append(temp[0])
>                    dir.append(temp[2])
> -----
>
> I can't figure out a nicer way of doing it without turning the thing
> into a nested list (non-ideal).  I could put the entire tuple inside
> of a list, but that gets ugly with retrieval.  I'm sure there is an
> easier way to store this.  I was having trouble with dictionaries due
> to non-unique keys when I tried that route.
>
> Any ideas for a better way to store it?  This ultimately parses a
> giant amount of data (ASCII DXFs) and spits the information into a
> csv, and I find the writing of nested lists cumbersome and I'm sure
> I'm missing something as I'm quite new to Python.

What you're doing (pace error checking) seems fine for the data
structures that you're using.  I'm not entirely clear what your usage
pattern for "dip" and "dir" is once you've got them, so I can't say
whether there's a more appropriate shape for them.  I am a bit curious,
though, as to why a nested list is non-ideal.

...
    if "/" in word and "dip" not in word:
        dip_n_dir.append(word.split("/", 1))

is marginally shorter, and has the virtue of making it harder to use
unrelated dip and dir values together.
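
As a rough sketch of how that might fit together end to end, assuming the
goal is one dip,dir row per match in the CSV: the names read_dip_dir and
write_csv, the header row, and the sample lines are all made up for
illustration and are not taken from your code.

```python
import csv
import io


def read_dip_dir(lines):
    """Collect [dip, dir] string pairs from an iterable of text lines."""
    dip_n_dir = []
    for line in lines:
        for word in line.lower().split():
            # Same test as the original loop: keep "55/158", skip "dip/dir."
            if "/" in word and "dip" not in word:
                dip_n_dir.append(word.split("/", 1))
    return dip_n_dir


def write_csv(pairs, out):
    """Write a header and one row per pair to an open text stream."""
    writer = csv.writer(out)
    writer.writerow(["dip", "dir"])  # hypothetical header, not in the original
    writer.writerows(pairs)          # a list of lists maps directly onto rows


# Made-up sample standing in for the real file contents.
sample = ["garbage", "dip/dir.", "more garbage", "55/158", "junk", "33/156"]
pairs = read_dip_dir(sample)

# StringIO stands in for a real file opened with newline="".
buffer = io.StringIO()
write_csv(pairs, buffer)
csv_text = buffer.getvalue()
```

The nested list is the shape csv.writer wants anyway, so writing it out is
a single writerows() call.  If separate columns are needed later,
zip(*pairs) will split a non-empty list of pairs back into dips and dirs.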

-- 
Rhodri James *-* Wildebeeste Herder to the Masses


