Printing a drop down menu for a specific field.

Nick the Gr33k nikos.gr33k at gmail.com
Sun Oct 27 03:31:07 EDT 2013


On 27/10/2013 6:00 AM, rurpy at yahoo.com wrote:
> On 10/26/2013 06:11 PM, Nick the Gr33k wrote:
>> I am trying.
>>
>> Ah, I found it. I had to change this line in your code:
>>               key = host, city, useros, browser, ref
>>
>> to this line:
>>
>>               key = host, city, useros, browser
>>
>> so that 'ref' is not included in the unique combination key.
>>
>> I am still trying to understand the logic of your code, and I am trying to
>> create a history list column for the referrers.
>>
>> I don't know how to write it, though, so that it produces the same kind of
>> output for referrers.
>>
>> The best I came up with is:
>>
>> [code]
>> def coalesce( data ):
>> 		newdata = []
>> 		seen = {}
>> 		for host, city, useros, browser, ref, hits, visit in data:
>> 			# Here i have to decide how to group the rows together.
>> 			# I want an html row for every unique combination of (host, city, useros, browser) and that hits should be summed together.
>> 			key = host, city, useros, browser
>> 			if key not in seen:
>> 				newdata.append( [host, city, useros, browser, [ref], hits, [visit]] )
>> 				seen[key] = len( newdata ) - 1		# Save index (for 'newdata') of this row.
>> 			else:		# This row is a duplicate row with a different visit time.
>> 				rowindex = seen[key]
>> 				newdata[rowindex][4].append( ref )
>> 				newdata[rowindex][5] += hits
>> 				newdata[rowindex][6].append( visit )
>> 		return newdata
>>
>> 		
>> 	cur.execute( '''SELECT host, city, useros, browser, ref, hits, lastvisit FROM visitors
>> 					WHERE counterID = (SELECT ID FROM counters WHERE url = %s) ORDER BY lastvisit DESC''', page )
>> 	data = cur.fetchall()
>>
>> 	
>> 	newdata = coalesce( data )
>> 	for row in newdata:
>> 		(host, city, useros, browser, refs, hits, visits) = row
>> 		# Note that 'ref' & 'visits' are now lists of visit times.
>> 		
>> 		print( "<tr>" )
>> 		for item in (host, city, useros, browser):
>> 			print( "<td><center><b><font color=white> %s </td>" % item )
>> 			
>> 		print( "<td><select>" )
>> 		for n, ref in enumerate( refs ):
>> 			if n == 0:
>> 				op_selected = 'selected="selected"'
>> 			else:
>> 				op_selected = ''
>> 		print( "<option %s>%s</option>" % (op_selected, ref) )
>> 		print( "</select></td>" )
>>
>> 		for item in (hits):
>> 			print( "<td><center><b><font color=white> %s </td>" % item )
>> 			
>> 		print( "<td><select>" )
>> 		for n, visit in enumerate( visits ):
>> 			visittime = visit.strftime('%A %e %b, %H:%M')
>> 			if n == 0:
>> 				op_selected = 'selected="selected"'
>> 			else:
>> 				op_selected = ''
>> 			print( "<option %s>%s</option>" % (op_selected, visittime) )
>> 		print( "</select></td>" )
>> 		
>> 		print( "</tr>" )
>> [/code]
>>
>> But this doesn't work correctly for refs, and for some reason it also does
>> not print the hits and visit columns.
>
> Without a traceback it is hard to figure out what is happening.
> (Actually in this case there is one obvious error, but there are
> also some unobvious ones.)
>
> Here is what I did to find the problems, and what you can do
> the next time.  The main thing was to extract the code from
> the cgi script so that I could run it outside of the web server
> and without needing access to the database.  Then you can add
> print statements (or run with the pdb debugger) and see tracebacks
> and other errors easily.
>
> 1. Copy and paste the code from your message into a .py file.
> 2. Put a "def main():" line at the top of your main code.
> 3. Add a line at the bottom to call main().
> 4. Copy and paste a part of the web page you gave that lists all the visits.
> 5. Edit it (change TAB to "|", add commas after each line, etc.) to
>   create a statement that will create a variable, DATA.
> 6. Add a few statements to turn DATA into a variable 'data' which has a
>   format similar to the format returned by your cur.fetchall() call.
> This all took just 10 or 15 minutes, and I ended up with the
> following code:
> --------------
>      def main():
>          data = [ln.split('|') for ln in DATA]
>          for r in data: r[5] = int(r[5])  # Change the 'hit' values from str to int.
>          print ('<table border="1">')
>
>          newdata = coalesce( data )
>          for row in newdata:
>                  (host, city, useros, browser, refs, hits, visits) = row
>                  # Note that 'ref' & 'visits' are now lists of visit times.
>
>                  print( "<tr>" )
>                  for item in (host, city, useros, browser):
>                          print( "<td><center><b><font color=white> %s </td>" % item )
>
>                  print( "<td><select>" )
>                  for n, ref in enumerate( refs ):
>                          if n == 0:
>                                  op_selected = 'selected="selected"'
>                          else:
>                                  op_selected = ''
>                  print( "<option %s>%s</option>" % (op_selected, ref) )
>                  print( "</select></td>" )
>
>                  for item in (hits):
>                          print( "<td><center><b><font color=white> %s </td>" % item )
>
>                  print( "<td><select>" )
>                  for n, visit in enumerate( visits ):
>                          visittime = visit.strftime('%A %e %b, %H:%M')
>                          if n == 0:
>                                  op_selected = 'selected="selected"'
>                          else:
>                                  op_selected = ''
>                          print( "<option %s>%s</option>" % (op_selected, visittime) )
>                  print( "</select></td>" )
>                  print( "</tr>" )
>
>      def coalesce( data ):
>                  newdata = []
>                  seen = {}
>                  for host, city, useros, browser, ref, hits, visit in data:
>                          # Here i have to decide how to group the rows together.
>                          # I want an html row for every unique combination of (host, city, useros, browser) and that hits should be summed together.
>                          key = host, city, useros, browser
>                          if key not in seen:
>                                  newdata.append( [host, city, useros, browser, [ref], hits, [visit]] )
>                                  seen[key] = len( newdata ) - 1		# Save index (for 'newdata') of this row.
>                          else:		# This row is a duplicate row with a different visit time.
>                                  rowindex = seen[key]
>                                  newdata[rowindex][4].append( ref )
>                                  newdata[rowindex][5] += hits
>                                  newdata[rowindex][6].append( visit )
>                  return newdata
>
>      DATA = [
>      '209.133.77.165.T01713-01.above.net|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:59',
>      'mail14.ess.barracuda.com|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:49',
>      'mail14.ess.barracuda.com|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:48',
>      'mail0.ess.barracuda.com|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:48',
>      'mail0.ess.barracuda.com|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:47',
>      '209.133.77.164.T01713-01.above.net|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:47',
>      '89-145-108-206.as29017.net|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:47',
>      ]
>
>      main()
> -------------
>
> I then ran it and it reported:
>      [...some html output...]
>      Traceback (most recent call last):
>        File "xx2.py", line 88, in <module>
>          if __name__ == '__main__': main()
>        File "xx2.py", line 26, in main
>          for item in (hits):
>      TypeError: 'int' object is not iterable
>
> Line 26 is:
>
>          for item in (hits):
>
> Remember that 'hits' is just an integer number, not a list.
> So I changed:
>
>          for item in (hits):
>                  print( "<td><center><b><font color=white> %s </td>" % item )
>
> to:
>
>          print( "<td><center><b><font color=white> %s </td>" % hits )
>
> and ran again.  This time:
>
>      Traceback (most recent call last):
>        File "xx2.py", line 87, in <module>
>          if __name__ == '__main__': main()
>        File "xx2.py", line 30, in main
>          visittime = visit.strftime('%A %e %b, %H:%M')
>      AttributeError: 'str' object has no attribute 'strftime'
>
> This is not a problem with the program but with the input data.  When
> you get data from your database, 'visit' is a datetime object.  But
> when it comes from the synthetic data we created, it is an already-
> formatted string.
>
> So I replaced:
>      visittime = visit.strftime('%A %e %b, %H:%M')
>
> with
>
>      visittime = visit   #.strftime('%A %e %b, %H:%M')
>
> I also realized you use white fonts so I changed the table to have
> a blue background color:
>
>      print ('<table border="1">')
>
> to
>
>      print ('<table border="1" bgcolor="blue">')
>
> Now when I run the program it runs without errors and produces html
> output.  So run again but save the output to a file.
>
>      $ python3 test.py > test.html
>
> And open the file with a browser.  Looks ok but none of the 'ref'
> buttons has more than one entry.  So I edited the data to be:
>
>      DATA = [
>      '209.133.77.165.T01713-01.above.net|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:59',
>      'mail14.ess.barracuda.com|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:49',
>      'mail14.ess.barracuda.com|Άγνωστη Πόλη|Windows|Explorer|http://superhost.gr/|1|Σάββατο 26 Οκτ, 18:48',
>      'mail0.ess.barracuda.com|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:48',
>      'mail0.ess.barracuda.com|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:47',
>      '209.133.77.164.T01713-01.above.net|Άγνωστη Πόλη|Windows|Explorer|Direct Hit|1|Σάββατο 26 Οκτ, 18:47',
>      'mail14.ess.barracuda.com|Άγνωστη Πόλη|Windows|Explorer|http://mythosweb.gr/|1|Σάββατο 26 Οκτ, 18:22',
>      ]
>
> And ran again.  But still, the 'ref' button for mail14.ess.barracuda.com
> showed only one item in the dropdown list.  After looking at the generated
> source code and adding some print statements to the Python code I realized
> that you had:
>                  for n, ref in enumerate( refs ):
>                          if n == 0:
>                                  op_selected = 'selected="selected"'
>                          else:
>                                  op_selected = ''
>                  print( "<option %s>%s</option>" % (op_selected, ref) )
>
> but what you want is:
>
>                  for n, ref in enumerate( refs ):
>                          if n == 0:
>                                  op_selected = 'selected="selected"'
>                          else:
>                                  op_selected = ''
>                          print( "<option %s>%s</option>" % (op_selected, ref) )
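
To see the effect of that indentation in isolation, here is a minimal sketch. The two helper functions and the sample referrer list are hypothetical, written only for this illustration; they collect the lines in a list instead of printing them one by one.

```python
def options_outside_loop(refs):
    """Mimics the original code: the <option> line sits after the loop."""
    out = []
    for n, ref in enumerate(refs):
        op_selected = 'selected' if n == 0 else ''
    # Runs once, after the loop has finished, so it only sees the values
    # left over from the last iteration (assumes refs is not empty).
    out.append("<option %s>%s</option>" % (op_selected, ref))
    return out


def options_inside_loop(refs):
    """Fixed version: the <option> line is part of the loop body."""
    out = []
    for n, ref in enumerate(refs):
        op_selected = 'selected' if n == 0 else ''
        out.append("<option %s>%s</option>" % (op_selected, ref))
    return out


refs = ['Direct Hit', 'http://superhost.gr/', 'http://mythosweb.gr/']
print(options_outside_loop(refs))  # one option, for the last referrer only
print(options_inside_loop(refs))   # three options, the first one selected
```

The first function produces a single option no matter how many referrers were collected, which matches the dropdown that showed only one item.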
>
> I also realized after looking at the HTML spec for the OPTION element
>    (http://www.w3.org/TR/html401/interact/forms.html#h-17.6)
> that 'selected="selected"' should be just 'selected'.  So in
> both places they occur, I changed
>
>                          if n == 0:
>                                  op_selected = 'selected="selected"'
> to
>
>                          if n == 0:
>                                  op_selected = 'selected'
>
> Now the generated page looks right and the only thing left to do
> is to copy the fixed code back into your main cgi script, remembering
> to undo the temp change made for testing:
>
>      visittime = visit   #.strftime('%A %e %b, %H:%M')
>
> back to
>
>      visittime = visit.strftime('%A %e %b, %H:%M')
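
One way to avoid editing that line back and forth is a small helper that accepts both kinds of value. This is only a sketch: format_visit is a made-up name that does not appear in the script, the sample string is invented, and the helper uses %d instead of %e because %e is not supported by strftime on every platform.

```python
from datetime import datetime


def format_visit(visit):
    """Return display text for a visit time.

    Real rows from the database carry datetime objects; the synthetic test
    rows carry strings that are already formatted.
    """
    if hasattr(visit, 'strftime'):
        return visit.strftime('%A %d %b, %H:%M')
    return visit  # already a string, leave it alone


print(format_visit(datetime(2013, 10, 26, 18, 59)))
print(format_visit('Saturday 26 Oct, 18:59'))
```

With such a helper the same loop could run against both the test data and the database rows, at the cost of one extra function.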
>
>
> So here is the fixed code:
> ----------------
>          newdata = coalesce( data )
>          for row in newdata:
>                  (host, city, useros, browser, refs, hits, visits) = row
>                  # Note that 'refs' and 'visits' are now lists (of referrers and visit times).
>
>                  print( "<tr>" )
>                  for item in (host, city, useros, browser):
>                          print( "<td><center><b><font color=white> %s </td>" % item )
>
>                  print( "<td><select>" )
>                  for n, ref in enumerate( refs ):
>                          if n == 0:
>                                  op_selected = 'selected'
>                          else:
>                                  op_selected = ''
>                          print( "<option %s>%s</option>" % (op_selected, ref) )
>                  print( "</select></td>" )
>
>                  print( "<td><center><b><font color=white> %s </td>" % hits )
>
>                  print( "<td><select>" )
>                  for n, visit in enumerate( visits ):
>                          visittime = visit.strftime('%A %e %b, %H:%M')
>                          if n == 0:
>                                  op_selected = 'selected'
>                          else:
>                                  op_selected = ''
>                          print( "<option %s>%s</option>" % (op_selected, visittime) )
>                  print( "</select></td>" )
>                  print( "</tr>" )
>
>          def coalesce( data ):
>                  newdata = []
>                  seen = {}
>                  for host, city, useros, browser, ref, hits, visit in data:
>                          # Here i have to decide how to group the rows together.
>                          # I want an html row for every unique combination of (host, city, useros, browser) and that hits should be summed together.
>                          key = host, city, useros, browser
>                          if key not in seen:
>                                  newdata.append( [host, city, useros, browser, [ref], hits, [visit]] )
>                                  seen[key] = len( newdata ) - 1		# Save index (for 'newdata') of this row.
>                          else:		# This row is a duplicate row with a different visit time.
>                                  rowindex = seen[key]
>                                  newdata[rowindex][4].append( ref )
>                                  newdata[rowindex][5] += hits
>                                  newdata[rowindex][6].append( visit )
>                  return newdata
> ----------------


Once again, thank you very much for your help, and especially for taking
the time and effort to explain to me in detail the steps you followed to
make the code work.

I read it thoroughly and tested it and it works as it should.

I just wanted to mention that the definition of the function coalesce()
must come before:

>          newdata = coalesce( data )
>          for row in newdata:

because a function must be defined before we try to call it and pass
data to it, so I placed it just before that.
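
More precisely, the name only has to exist by the time the call actually runs. A minimal sketch with made-up function names (helper, too_early) that shows both cases:

```python
def main():
    # helper() is looked up when this line runs, not when main() is defined.
    return helper([1, 2, 3])


def helper(items):
    return sum(items)


result = main()      # works: helper() exists by the time main() is called

try:
    too_early()      # fails: the def below has not been executed yet
except NameError as exc:
    error = str(exc)


def too_early():
    return 'ok'


print(result)
print(error)
```

This is also why the test file with main() defined at the top could work: the call to main() was the last line, after coalesce() had been defined.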

I have also changed the data insertion to:

		# Always create a new record, even if the visitor already exists.
		cur.execute('''INSERT INTO visitors (counterID, host, city, useros, browser, ref, lastvisit)
					   VALUES (%s, %s, %s, %s, %s, %s, %s)''',
					   (cID, host, city, useros, browser, ref, lastvisit) )

I removed the 'ON DUPLICATE KEY UPDATE' clause I had, and I also removed the
unique index on (counterID, host), so that every time a visitor comes to my
website a new database entry is created, even if the hostname is the same.
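
To illustrate what the table looks like after that change, here is a sketch that uses Python's built-in sqlite3 module as a stand-in for the real database. The reduced table layout and the sample rows are assumptions made for the example, and sqlite3 uses ? placeholders where the real script uses %s.

```python
import sqlite3

conn = sqlite3.connect(':memory:')   # throwaway in-memory database
cur = conn.cursor()

# Reduced, hypothetical version of the visitors table, without a unique index.
cur.execute('''CREATE TABLE visitors
               (counterID INTEGER, host TEXT, ref TEXT, lastvisit TEXT)''')

visits = [
    (1, 'mail14.ess.barracuda.com', 'Direct Hit', '2013-10-26 18:49'),
    (1, 'mail14.ess.barracuda.com', 'http://superhost.gr/', '2013-10-26 18:48'),
]
for visit in visits:
    # A plain INSERT: the same host simply gets a second row.
    cur.execute('''INSERT INTO visitors (counterID, host, ref, lastvisit)
                   VALUES (?, ?, ?, ?)''', visit)

cur.execute('''SELECT host, ref FROM visitors
               WHERE counterID = ? ORDER BY lastvisit DESC''', (1,))
rows = cur.fetchall()
print(rows)
conn.close()
```

Both rows come back for the same host, and it is then the job of coalesce() to fold them into one HTML row with two referrers.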

I almost understand your code, but this part is not so clear to me:

if key not in seen:
    newdata.append( [host, city, useros, browser, [ref], hits, [visit]] )
    seen[key] = len( newdata ) - 1    # Save index (for 'newdata') of this row.
else:    # This row is a duplicate row with a different referrer & visit time.
    rowindex = seen[key]
    newdata[rowindex][4].append( ref )
    newdata[rowindex][5] += hits
    newdata[rowindex][6].append( visit )


I could not have written this part myself at all.
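
For what it is worth, the bookkeeping can be traced on a tiny input. The sketch below repeats the function from this thread with one change made purely for illustration (it also returns the 'seen' dictionary); the three sample rows are made up.

```python
def coalesce(data):
    newdata = []
    seen = {}   # maps (host, city, useros, browser) -> position in newdata
    for host, city, useros, browser, ref, hits, visit in data:
        key = host, city, useros, browser
        if key not in seen:
            # First time this combination appears: start a new output row
            # and remember where it was stored.
            newdata.append([host, city, useros, browser, [ref], hits, [visit]])
            seen[key] = len(newdata) - 1
        else:
            # Combination already seen: find its row and extend it.
            rowindex = seen[key]
            newdata[rowindex][4].append(ref)    # one more referrer
            newdata[rowindex][5] += hits        # add up the hits
            newdata[rowindex][6].append(visit)  # one more visit time
    return newdata, seen


rows = [
    ('hostA', 'Athens', 'Windows', 'Explorer', 'Direct Hit', 1, '18:49'),
    ('hostB', 'Athens', 'Linux', 'Firefox', 'Direct Hit', 1, '18:48'),
    ('hostA', 'Athens', 'Windows', 'Explorer', 'http://superhost.gr/', 1, '18:47'),
]
newdata, seen = coalesce(rows)
for row in newdata:
    print(row)
print(seen)
```

After the first row 'seen' holds index 0 for hostA, after the second it holds index 1 for hostB, and the third row finds hostA already at index 0, so its referrer, hits and visit time are merged into that existing row instead of creating a new one.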






