Stripping non-numbers from a file parse without nested lists?
andrew cooke
andrew at acooke.org
Tue Mar 31 21:58:07 EDT 2009
Rhodri James wrote:
> On Tue, 31 Mar 2009 06:51:33 +0100, <daku9999 at gmail.com> wrote:
>
>> There has got to be a better way of doing this:
>>
>> I'm reading in a file that has a lot of garbage, but eventually has
>> something that looks similar to:
>> (some lines of garbage)
>> dip/dir.
>> (some more lines of garbage)
>> 55/158
>> (some more lines of garbage)
>> 33/156
>> etc.
>>
>> and I'm stripping out the 55/158 values (with error checking
>> removed):
>> ------
>> def read_data(filename):
>>     fh = open(filename, "r", encoding="ascii")
>>
>>     for line in fh:
>>         for word in line.lower().split():
>>             if "/" in word and "dip" not in word:
>>                 temp = word.partition("/")
>>                 dip.append(temp[0])
>>                 dir.append(temp[2])
>> -----
>>
>> I can't figure out a nicer way of doing it without turning the thing
>> into a nested list (non-ideal). I could put the entire tuple inside
>> of a list, but that gets ugly with retrieval. I'm sure there is an
>> easier way to store this. I was having trouble with dictionaries due
>> to non-unique keys when I tried that route.
>>
>> Any ideas for a better way to store it? This ultimately parses a
>> giant amount of data (ascii dxf's) and spits the information into a
>> csv, and I find the writing of nested lists cumbersome and I'm sure
>> I'm missing something as I'm quite new to Python.
i don't follow exactly what the problem is, but the mention of nested
lists makes me think maybe you need a generator. you can define this
function:
def tokens(filename):
    with open(filename, "r", encoding="ascii") as fh:
        for line in fh:
            for word in line.lower().split():
                if "/" in word and "dip" not in word:
                    temp = word.partition("/")
                    yield (temp[0], temp[2])
and then elsewhere do:
for (val1, val2) in tokens(filename):
    .... stuff here ...
which is a very common pattern for avoiding constructing lists of things
that you want to use elsewhere.
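since the end goal is a csv, here is a rough sketch of how the same pattern could feed csv.writer directly, with no intermediate lists. the names dip_dir_pairs and write_pairs and the sample lines are made up for illustration; the generator takes any iterable of lines (an open file would work too) so it can be tried without a real dxf file. it keeps the same filter as above, so any other word containing "/" would still get through.

```python
import csv
import io


def dip_dir_pairs(lines):
    """Yield (dip, dir) string pairs from an iterable of text lines."""
    for line in lines:
        for word in line.lower().split():
            if "/" in word and "dip" not in word:
                dip, _, direction = word.partition("/")
                yield dip, direction


def write_pairs(pairs, out):
    """Write a header plus one csv row per pair; return the row count."""
    writer = csv.writer(out)
    writer.writerow(["dip", "dir"])
    count = 0
    for dip, direction in pairs:
        writer.writerow([dip, direction])
        count += 1
    return count


# made-up sample input standing in for the real file
sample = ["garbage", "dip/dir.", "more garbage", "55/158", "junk 33/156 junk"]

# io.StringIO stands in for an output file opened with newline=""
buf = io.StringIO()
n = write_pairs(dip_dir_pairs(sample), buf)

# if two separate sequences really are needed, zip(*...) splits the pairs
# (this raises ValueError if the input yields no pairs at all)
dips, dirs = zip(*dip_dir_pairs(sample))
```

each pair is written as soon as it is found, so nothing has to be held in memory. for a real file you would pass the open file object as lines and an output file opened with newline="" as out.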
andrew