newbie - again ! iterating through two nested lists?

Joal Heagney s713221 at student.gu.edu.au
Sat Sep 1 22:04:25 EDT 2001


eif wrote:
> 
> thanks for the help before guys - what im trying to do now is iterate
> through two nested lists - im trying to compare the third element of each
> list and if they are equal then I want to append this element plus some of
> the other element s into a new list this is what ive got so far:
> 
> #!/usr/local/bin/python
> 
> import string
> listedRecs1 = [rec.split('|') for rec in open('lecturer.txt').xreadlines()]
> listedRecs2 = [rec.split('|') for rec in open('exam.txt').xreadlines()]
> intersection = []
> disjunction = []
> x=0
> z=0
> 
> for line in listedRecs1[x]:
>     if line in listedRecs2[z]:
>         intersection.append(line)
>     else:
>         disjunction.append(line)
>     x= x+1
> 
> print intersection
> 
> the output is: ['', '016354']
> so its comparing the first in each - but can anyone help me to do the
> others?
> plese help its the early hours and my head is hurting
> 
> these are the two lists:
> 
> listedRecs1
> [['999347857', '', '9/1/2001 at 13:37:37', '016354', 'jim o shea',
> 'jim at mmu.ac.uk', '126', '0161025658\n'], ['999347943', '', '9/1/2001 at
> 13:39:03', '223654', 'roy saberton', 'roy at mmu.ac.uk', '201',
> '0161236510\n'], ['999347943', '', '9/1/2001 at 13:39:03', '223654', 'roy
> saberton', 'roy at mmu.ac.uk', '201', '0161236510\n']]
> 
> listedRecs2
> [['999369265', '', '9/1/2001 at 19:34:25', '016354', 'maths', '2.00',
> '30.6.01', '9.00am\n'], ['999369338', '', '9/1/2001 at 19:35:38', '223654',
> 'english', '1.30', '31.06.01', '1.30\n'], ['999369367', '', '9/1/2001 at
> 19:36:07', '568796', 'welsh', '3.00', '01.07.01', '1.30\n']]

Not really sure what you're trying to do here. If you're trying to scan
through the lists in listedRecs1 and compare the element at index 3 of
each one (strictly the fourth field, since Python counts from 0) with
the same field of the lists in listedRecs2 to see if there's a match,
the following does that:

>>> for line in listedRecs1:
	notfound = 1
	for line2 in listedRecs2:
		if line[3] == line2[3]:
			intersection.append(line)
			notfound = 0
	if notfound:
		disjunction.append(line)

But with your supplied data, every record finds a match:

>>> intersection
[['999347857', '', '9/1/2001 at 13:37:37', '016354', 'jim o shea',
'jim at mmu.ac.uk', '126', '0161025658\n'], ['999347943', '', '9/1/2001
at13:39:03', '223654', 'roy saberton', 'roy at mmu.ac.uk', '201',
'0161236510\n'], ['999347943', '', '9/1/2001 at 13:39:03', '223654',
'roysaberton', 'roy at mmu.ac.uk', '201', '0161236510\n']]
>>> disjunction
[]
>>> for i in listedRecs1:
	print i[3]
	
016354
223654
223654
>>> for i in listedRecs2:
	print i[3]

016354
223654
568796

So for each list in listedRecs1, there is another list in listedRecs2
which has a matching element at index 3, i.e. no disjunctions.
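
For what it's worth, here is a rough, self-contained sketch of the same
matching in present-day Python 3 (the session above is Python 2). The
function name partition_by_key, the cut-down rows and the third lecturer
row are all made up for illustration; they are not in your files. It
collects the index-3 values of the second list in a set first, so each
record is tested once instead of being compared against every exam row.
One difference from the nested loop: a record is added only once even if
several rows in the second list share its key.

```python
def partition_by_key(left, right, key_index=3):
    """Split the rows of `left` by whether row[key_index] also occurs in `right`."""
    right_keys = {row[key_index] for row in right}
    matched, unmatched = [], []
    for row in left:
        if row[key_index] in right_keys:
            matched.append(row)
        else:
            unmatched.append(row)
    return matched, unmatched


# Rows cut down to five fields; index 3 is still the key.
lecturers = [
    ['999347857', '', '9/1/2001 at 13:37:37', '016354', 'jim o shea'],
    ['999347943', '', '9/1/2001 at 13:39:03', '223654', 'roy saberton'],
    # Hypothetical row, not in the original data, so that something fails to match.
    ['000000000', '', 'made-up timestamp', '999999', 'nobody'],
]
exams = [
    ['999369265', '', '9/1/2001 at 19:34:25', '016354', 'maths'],
    ['999369338', '', '9/1/2001 at 19:35:38', '223654', 'english'],
    ['999369367', '', '9/1/2001 at 19:36:07', '568796', 'welsh'],
]

intersection, disjunction = partition_by_key(lecturers, exams)
print([row[3] for row in intersection])
print([row[3] for row in disjunction])
```

Only the made-up row should land in disjunction here.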

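If you only want to add that one element (line[3]) to the intersection,
you can alter the above code as follows. (Reset intersection and
disjunction to [] before this and each of the later runs, otherwise the
earlier results are still in there.)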

>>> for line in listedRecs1:
	notfound = 1
	for line2 in listedRecs2:
		if line[3] == line2[3]:
			intersection.append(line[3])
			notfound = 0
	if notfound:
		disjunction.append(line[3])

		
>>> intersection
['016354', '223654', '223654']
>>> disjunction
[]

If you want to add, say, the name and the email fields as well, you can
build a small list and append that instead:

>>> for line in listedRecs1:
	notfound = 1
	for line2 in listedRecs2:
		if line[3] == line2[3]:
			intersection.append([line[3],line[4],line[5]])
			notfound = 0
	if notfound:
		disjunction.append([line[3],line[4],line[5]])

		
>>> intersection
[['016354', 'jim o shea', 'jim at mmu.ac.uk'], ['223654', 'roy saberton',
'roy at mmu.ac.uk'], ['223654', 'roysaberton', 'roy at mmu.ac.uk']]
>>> disjunction
[]

Or with slices (line[3:6] gives the elements at indexes 3, 4 and 5):
>>> for line in listedRecs1:
	notfound = 1
	for line2 in listedRecs2:
		if line[3] == line2[3]:
			intersection.append(line[3:6])
			notfound = 0
	if notfound:
		disjunction.append(line[3:6])

		
>>> intersection
[['016354', 'jim o shea', 'jim at mmu.ac.uk'], ['223654', 'roy saberton',
'roy at mmu.ac.uk'], ['223654', 'roysaberton', 'roy at mmu.ac.uk']]
>>> disjunction
[]
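
A small check of the slice version, as a sketch in Python 3. The first
record is copied from your data; the variable names and the truncated
record called short are hypothetical. It shows that the slice and the
explicit indexes pick the same fields, and that they behave differently
on a record with fields missing.

```python
line = ['999347857', '', '9/1/2001 at 13:37:37', '016354', 'jim o shea',
        'jim at mmu.ac.uk', '126', '0161025658\n']

by_index = [line[3], line[4], line[5]]
by_slice = line[3:6]            # the end index, 6, is not included
print(by_index == by_slice)

# Hypothetical truncated record: only four fields.
short = ['999347857', '', '9/1/2001 at 13:37:37', '016354']
print(short[3:6])               # a slice just returns what is there
try:
    short[5]                    # a plain index does not
except IndexError:
    print('short[5] raises IndexError')
```

So the slice is the more forgiving of the two if a line in the file is
missing fields, which may or may not be what you want.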

Now the problem is that for certain combinations of fields, we could end
up with duplicate results in intersection or disjunction, so you may
want to do some checking before adding to these lists. For this example,
I prebuild a result list and store it in the variable result, and then
check to see if it's in the intersection/disjunction lists already.

>>> for line in listedRecs1:
	notfound = 1
	for line2 in listedRecs2:
		if line[3] == line2[3]:
			result = [line[3],line[5]]
			if result not in intersection:
				intersection.append(result)
			notfound = 0
	if notfound:
		result = [line[3],line[5]]
		if result not in disjunction:
			disjunction.append(result)

			
>>> intersection
[['016354', 'jim at mmu.ac.uk'], ['223654', 'roy at mmu.ac.uk']]
>>> disjunction
[]
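
The "result not in intersection" test scans the whole list every time.
A possible alternative, sketched in Python 3 under the assumption that
the picked fields are all strings: remember what has been seen in a set
of tuples. The function name unique_fields and its parameters are made
up for this example.

```python
def unique_fields(rows, indexes):
    """Return [row[i] for i in indexes] for each row, skipping repeats.

    The order of first appearance is kept.
    """
    seen = set()
    result = []
    for row in rows:
        picked = tuple(row[i] for i in indexes)   # tuples can go in a set
        if picked not in seen:
            seen.add(picked)
            result.append(list(picked))
    return result


rows = [
    ['999347857', '', '9/1/2001 at 13:37:37', '016354', 'jim o shea',
     'jim at mmu.ac.uk'],
    ['999347943', '', '9/1/2001 at 13:39:03', '223654', 'roy saberton',
     'roy at mmu.ac.uk'],
    ['999347943', '', '9/1/2001 at 13:39:03', '223654', 'roysaberton',
     'roy at mmu.ac.uk'],
]
print(unique_fields(rows, (3, 5)))
```

For three records it makes no practical difference; it only starts to
matter when the lists get long.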

Now as a final comment, you have some data in your lists where
everything matches up except for name fields, e.g. 'roy saberton' and
'roysaberton'. You may want to design a list cleaner/builder that
ignores duplicate copies that differ only in case and whitespace.
The following removes the spaces from a string s and returns a new
string (I'm calling it s rather than str so that it doesn't hide the
built-in str type):
"".join(s.split(" "))
Use s.split() with no argument if you also want to drop tabs and
newlines. Then get rid of case-sensitivity by converting to lower or
upper case with the string methods lower/upper:
"".join(s.split(" ")).lower()

*chuckles* At this stage, I'd be seriously thinking about using
dictionaries.
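
To give an idea of what I mean, here is a rough sketch in Python 3. It
is not the only way to do it. The names normalize, first_by_key,
exam_by_code and combined are made up, the rows are cut down to five
fields, and it assumes at most one exam per code (a later exam with the
same code would overwrite the earlier one).

```python
def normalize(text):
    """Drop all whitespace and lower-case the rest."""
    return "".join(text.split()).lower()


lecturers = [
    ['999347857', '', '9/1/2001 at 13:37:37', '016354', 'jim o shea'],
    ['999347943', '', '9/1/2001 at 13:39:03', '223654', 'roy saberton'],
    ['999347943', '', '9/1/2001 at 13:39:03', '223654', 'roysaberton'],
]
exams = [
    ['999369265', '', '9/1/2001 at 19:34:25', '016354', 'maths'],
    ['999369338', '', '9/1/2001 at 19:35:38', '223654', 'english'],
    ['999369367', '', '9/1/2001 at 19:36:07', '568796', 'welsh'],
]

# Keep the first lecturer row for each (code, cleaned-up name) pair.
first_by_key = {}
for row in lecturers:
    first_by_key.setdefault((row[3], normalize(row[4])), row)

# Look exams up by code instead of looping over them every time.
exam_by_code = {row[3]: row for row in exams}

# Code and name from the lecturer row, subject from the matching exam row.
combined = [
    [row[3], row[4], exam_by_code[row[3]][4]]
    for row in first_by_key.values()
    if row[3] in exam_by_code
]
print(combined)
```

The dictionary lookup replaces the inner loop, and the tuple key takes
care of the 'roy saberton' / 'roysaberton' duplicate in one go.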

Hope that was of some help?
-- 
      Joal Heagney is: _____           _____
   /\ _     __   __ _    |     | _  ___  |
  /__\|\  ||   ||__ |\  || |___|/_\|___] |
 /    \ \_||__ ||___| \_|! |   |   \   \ !


