Error in webscraping problem

stanleydasilva93 at gmail.com
Thu Nov 17 06:55:30 EST 2016


I am trying to solve the following problem.  Two numbers appear on a
website.  The user has to enter their gcd (greatest common divisor) and
hit the submit button.  The catch is that the time limit is far too short
for a human to do this by hand, so it must be fully automated.  The
numbers change each time the user attempts the problem.

Unfortunately, I can't release the name of the website because of
corporate confidentiality, but I'm hoping someone may have some clues
as to what I'm doing wrong.  The code is below.  The FinalResults.html
file it writes contains a "Wrong Answer" message.

As a check, I kept a record of the results for one pair of numbers.
I saved the webpage corresponding to that fixed pair as "testRequest.txt".
The code:

 myfile = open("testRequest.txt", "r")
 page = myfile.read()

is used to check that the first number, second number, and solution are
all as they should be.
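
The offline check described above can be sketched roughly as follows.
This is only an illustration, written for Python 3 (math.gcd and print
as a function), whereas the script below is Python 2.  The helper name
extract_numbers and the sample HTML are made up; the real saved page is
not shown here, so the assumption is only that the two numbers sit in
the first two <strong>...</strong> tags, as in the script.

```python
import math

BEG_TAG = "<strong>"
END_TAG = "</strong>"


def extract_numbers(page):
    """Return the integers inside the first two <strong>...</strong> pairs."""
    numbers = []
    pos = 0
    for _ in range(2):
        start = page.find(BEG_TAG, pos)
        if start == -1:
            raise ValueError("expected two <strong> tags in the page")
        start += len(BEG_TAG)
        end = page.find(END_TAG, start)
        if end == -1:
            raise ValueError("unclosed <strong> tag in the page")
        numbers.append(int(page[start:end]))
        pos = end + len(END_TAG)  # continue searching after this pair
    return numbers[0], numbers[1]


# Made-up sample standing in for the contents of testRequest.txt
sample = "<p>Find the gcd of <strong>84</strong> and <strong>36</strong>.</p>"
first, second = extract_numbers(sample)
print(first, second, math.gcd(first, second))
```

With the real file, sample would be replaced by the text read from
testRequest.txt.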

I would be very grateful for any help or advice on the next steps.
Part of the difficulty is that I am not sure how to debug this.

Thanks,
Stanley.


import sys
sys.path.append('C:/Users/silviadaniel/Anaconda3/Lib/site-packages')
import mechanize
import requests
import fractions
url = "http://someWebsite.com/test"
br = mechanize.Browser()
br.open(url)
br.select_form(nr=0)
# There is only one form involved, so this is probably OK

print br.form
# reads <POST http://website.com/submit application/x-www-form-urlencoded
# <TextControl(divisor=)>
# <SubmitButtonControl(<None>=) (readonly)>>

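# Download the page again with requests (a second request, separate from
# the br.open(url) call above)
data = requests.get(url)
page = data.text
# The two numbers are the contents of the first two <strong>...</strong> tags
begTag = "<strong>"
endTag = "</strong>"
firstIndex = page.find(begTag)
secondIndex = page.find(endTag)
shift = len(begTag)
posNumber = firstIndex + shift
firstNumber = int(page[posNumber:secondIndex])
# Search again from just past the previous matches to find the second pair
firstIndex = page.find(begTag, firstIndex + 1)
secondIndex = page.find(endTag, secondIndex + 1)
posNumber = firstIndex + shift
secondNumber = int(page[posNumber:secondIndex])
solution = str(fractions.gcd(firstNumber, secondNumber))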
br["divisor"] = solution
print firstNumber # Looks sensible -- first number probably correct
print secondNumber # Looks sensible -- also checked
print solution # Is indeed the correct gcd
res = br.submit()

content = res.read()
with open("FinalResults.html", "w") as f:
    f.write(content)
# Unfortunately, this file contains "Wrong Answer"
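
One thing that may be worth checking, offered as a guess rather than a
diagnosis: the script downloads the page twice, once with br.open(url)
and once with requests.get(url).  Since the numbers change on each
attempt, the pair parsed from the requests copy may not be the pair the
server expects for the form that mechanize submits.  The sketch below
assumes the server ties the numbers to the page mechanize loaded, and
reads them from the response returned by br.open so that one page
supplies both the numbers and the form.  It is written for Python 3
(math.gcd, decoding bytes); parse_pair and solve_and_submit are
hypothetical helper names, and the utf-8 encoding is an assumption.

```python
import math
import re


def parse_pair(html):
    """Return the integers in the first two <strong>...</strong> elements."""
    found = re.findall(r"<strong>\s*(\d+)\s*</strong>", html)
    if len(found) < 2:
        raise ValueError("expected two numbers in <strong> tags")
    return int(found[0]), int(found[1])


def solve_and_submit(url):
    """Fetch the page once, solve it, and submit through the same browser."""
    import mechanize  # imported here so parse_pair can be tried without it

    br = mechanize.Browser()
    response = br.open(url)                  # the only download of the page
    html = response.read().decode("utf-8")   # encoding is an assumption
    first, second = parse_pair(html)
    br.select_form(nr=0)
    br["divisor"] = str(math.gcd(first, second))
    return br.submit().read()


# Offline check of the parsing step with a made-up page
print(parse_pair("<strong>84</strong> and <strong>36</strong>"))
```

If the result is still "Wrong Answer" with a single download, the cause
is presumably somewhere else, for example in what the form actually
sends.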


