help on Implementing a list of dicts with no data pattern

Dave Angel davea at davea.name
Thu May 9 08:26:03 EDT 2013


On 05/09/2013 05:57 AM, rlelis wrote:
> On Thursday, May 9, 2013 12:47:47 AM UTC+1, rlelis wrote:
>> Hi guys,
>>
>> I'm working on this long file, where I have to keep reading and
>> storing different excerpts of text (data) in different variables (lists).
>>
>> Once that is done, I want to store in dicts the data I got from the lists mentioned before. I want them in a list of dicts for later RDBMS purposes.
>>
>> The data I'm working with doesn't have a fixed pattern (see example below), so for each row I want to store combinations of word/value (key-value) to keep track of all the data.
>>
>> My problem is that while iterating over the list (the original one, a.k.a. file_content in the link), I'm nesting several if clauses to match
>> the keys I want. Having done that, I select the keys I want to give values to, and lastly I append that dict to a new list. The problem is that I always end up with the last line repeated several times, once for each row it finds.
>>
>> Please take a look at what I have now:
>>
>> http://pastebin.com/A9eka7p9
>
> Sorry, I thought that a link to pastebin could be helpful since it captures the syntax highlighting and spacing. I don't have fifty lines of code there. The 25 lines below were meant to show you guys a picture of what is going on, to be more intuitive.
> This is what I have for now:
>

The entire following set of comments is probably outdated since you 
apparently did NOT use readlines() or equivalent to get file_content. 
So you'd better give us some sample data, a program that can actually 
run without getting exceptions due to misnamed variables, and a 
description of just what you expected to be in each result variable.

It'd also be smart to mention what version of Python you're targeting.

.... what follows was a waste of my time ...

file_content is not defined, but we can guess you have read it from a 
text file with readlines(), or more efficiently that it's simply a file 
object for a file opened with "r".  Can we see sample data, maybe for
three or four lines?

file_content = [
     "A4 value2 aging",
     "b8 value99 paging",
     "-1 this is aging a test",
     "B2  repeaagingts",
     ]

The sample, or the description, should indicate if repeats of the 
"columns" column are allowed, as with b and B above.

> highway_dict = {}
> aging_dict = {}
> queue_row = []
> for content in file_content:
> 	if 'aging' in content:
> 		# aging 0 100
> 		collumns = ''.join(map(str, content[:1])).replace('-','_').lower()
> 		total_values =''.join(map(str, content[1:2]))
> 		aging_values = ''.join(map(str, content[2:]))

Those three lines would be much more reasonable and readable if you 
eliminated all the list stuff, and just did what was needed.  Also, 
calling a one-character string "collumns" or "total_values" makes no 
sense to me.

        collumns = content[:1].replace('-','_').lower()
        total_values = content[1:2]
        aging_values = content[2:]
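
To make the naming complaint concrete, here is a small sketch of what those slices return. It assumes each item of file_content is a plain string, as in my guessed sample above; the split() variant is only a guess at what may have been intended, since the real line format wasn't posted.

```python
# Assumption: each item of file_content is a plain string, as in the
# guessed sample data; the real file may look different.
content = "A4 value2 aging"

# Slicing a string yields substrings, not fields.
first = content[:1]      # 'A'
second = content[1:2]    # '4'
rest = content[2:]       # ' value2 aging'

# If whitespace-separated fields were intended, split the line first.
fields = content.split()                       # ['A4', 'value2', 'aging']
column = fields[0].replace('-', '_').lower()   # 'a4'
```

If the lines turn out to be lists already, the slices behave differently, which is one more reason to post real sample data.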

>
> 		aging_dict['total'], aging_dict[collumns] = total, aging_values

That line tries to get clever, and ends up obscuring what's really 
happening.  Further, the value in total, if any, is NOT what you just 
extracted in total_values.

        aging_dict['total'] = total
        aging_dict[collumns] = aging_values


> 	queue_row.append(aging_dict)

Just what do you expect to be in the aging_dict here?  If you intended 
that each item of queue_row contains a dict with just one item, then you 
need to create a new aging_dict each time through the loop.  As it stands, 
every append adds another reference to the one shared dict, so the list 
ends up with a bunch of entries that all show the same accumulated contents.
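
A minimal sketch of the difference between reusing one dict and building a fresh one per row. The sample rows and the names shared, buggy_rows and fixed_rows are made up for illustration; they are not from the posted program. Appending a dict to a list stores a reference to it, not a copy, which would explain seeing the same data repeated in every row.

```python
# Hypothetical sample rows; the real file_content was not posted.
file_content = ["a 1 aging", "b 2 aging", "c 3 aging"]

# Buggy pattern: one dict created outside the loop and appended repeatedly.
shared = {}
buggy_rows = []
for content in file_content:
    key, total, _ = content.split()
    shared[key] = total
    buggy_rows.append(shared)   # appends a reference, not a copy

# Fixed pattern: build a fresh dict inside the loop for every row.
fixed_rows = []
for content in file_content:
    key, total, _ = content.split()
    row = {key: total}
    fixed_rows.append(row)

print(buggy_rows[0] is buggy_rows[2])   # True: the same object three times
print(fixed_rows)
```

In the buggy version every list entry is the same object, so they all show whatever was stored last. In the fixed version each entry is its own dict.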

All the same remarks apply to the following code.  Additionally, you 
don't use collumns for anything, and you use lanes and state when you 
presumably meant lanes_values and state_values.
>
> 	if 'highway' in content:


> 		#highway	|	4 	|	disable |	25
> 		collumns = ''.join(map(str, content[:1])).replace('-','_').lower()
> 		lanes_values  =''.join(map(str, content[1:2]))		
> 		state_values = ''.join(map(str, content[2:3])).strip('')
> 		limit_values = ''.join(map(str, content[3:4])).strip('')
> 		
> 		highway_dict['lanes'], highway_dict['state'], highway_dict['limit(mph)'] = lanes, state, limit_values
> 	queue_row.append(highway_dict)
>
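
For what it's worth, here is one way the highway branch might look once the names are straightened out. This is only a sketch: parse_highway_row is a hypothetical helper, and the '|' separator is an assumption taken from the comment in the quoted code, not from real data.

```python
def parse_highway_row(line):
    """Return a new dict for one 'highway' line (hypothetical helper)."""
    # Assumption: fields are separated by '|', as the quoted comment hints.
    fields = [field.strip() for field in line.split('|')]
    # Raises ValueError if the line has fewer than four fields.
    _name, lanes, state, limit = fields[:4]
    return {'lanes': lanes, 'state': state, 'limit(mph)': limit}

queue_row = []
# Made-up sample lines in the assumed format.
for content in ["highway | 4 | disable | 25", "highway | 2 | enable | 55"]:
    if 'highway' in content:
        # A fresh dict is created on every call, so rows stay independent.
        queue_row.append(parse_highway_row(content))
```

Because the helper returns a new dict each time, nothing is shared between the rows in queue_row.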

Now, when

-- 
DaveA


