[Tutor] Fwd: How to Load Every Revised Wikipedia Page Revision

dbosah dbosah at buffalo.edu
Wed Feb 21 03:36:36 EST 2018





-------- Original message --------
From: Daniel Bosah <dbosah at buffalo.edu>
Date: 2/19/18 3:50 PM (GMT-05:00)
To: tutor at python.org
Subject: How to Load Every Revised Wikipedia Page Revision

Good day,
I'm doing research for a compsci group. I have a script that is supposed to load every revision of the Wikipedia article on FDR.
The script is supposed to do the following in a while loop, using the requests library:

    access the Wikipedia API
    if 'continue' is in the response
        update the query dict with the 'continue' values
        keep going until there is no more 'continue' (or until the API load limit is reached)
    else
        break

Here is the code:


import requests

def GetRevisions():
    url = "https://en.wikipedia.org/w/api.php"  # Wikipedia API endpoint
    # dictionary of the query arguments
    query = {
        "format": "json",
        "action": "query",
        "titles": "Franklin D. Roosevelt",
        "prop": "revisions",
        "rvlimit": 500,
    }
    while True:
        # request the URL with the current query parameters and decode the JSON
        r = requests.get(url, params=query).json()
        print(repr(r))  # repr gives the "official" string form of an object
        if 'continue' in r:
            # copy the continuation values into the query so the next
            # request picks up where this batch ended
            query.update(r['continue'])
        else:
            break  # no 'continue' key: this was the last batch


I want to load the full page text of every revision of the Wikipedia page, not just the metadata about each revision. How can I go about that?
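
One possible direction, offered as a minimal sketch rather than a confirmed answer: the revisions query accepts an `rvprop` parameter, and including `content` in it (together with `rvslots=main` on current MediaWiki versions) should return the wikitext of each revision alongside its metadata. The sketch below assumes `formatversion=2` response shapes. The names `iter_revision_texts`, `fetch_from_wikipedia`, and the User-Agent string are hypothetical, introduced only for illustration. As far as I know, the API caps `rvlimit` at 50 for ordinary clients once content is requested, so the default here is 50 rather than 500.

```python
def iter_revision_texts(fetch, title, limit=50):
    """Yield (revid, timestamp, text) for each revision of `title`.

    `fetch` is any callable that takes a dict of query parameters and
    returns the decoded JSON response as a dict (hypothetical helper
    interface, so the loop can be exercised without network access).
    """
    query = {
        "format": "json",
        "formatversion": 2,                  # assumption: v2 response layout
        "action": "query",
        "titles": title,
        "prop": "revisions",
        "rvprop": "ids|timestamp|content",   # 'content' asks for the page text
        "rvslots": "main",                   # main content slot of each revision
        "rvlimit": limit,
    }
    while True:
        data = fetch(query)
        for page in data.get("query", {}).get("pages", []):
            for rev in page.get("revisions", []):
                # text is None when a revision's content is hidden or missing
                text = rev.get("slots", {}).get("main", {}).get("content")
                yield rev.get("revid"), rev.get("timestamp"), text
        if "continue" not in data:
            break                            # last batch reached
        query.update(data["continue"])       # carry continuation into next request


def fetch_from_wikipedia(params):
    """Fetch one batch from the English Wikipedia API using requests."""
    import requests  # imported here so the generator above has no hard dependency
    response = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params=params,
        headers={"User-Agent": "revision-research-script/0.1"},  # placeholder value
    )
    response.raise_for_status()
    return response.json()


# Example use (makes real network requests, so it is left commented out):
# for revid, timestamp, text in iter_revision_texts(
#         fetch_from_wikipedia, "Franklin D. Roosevelt"):
#     print(revid, timestamp, len(text or ""))
```

Because the function is a generator, each revision can be processed or written to disk as it arrives instead of holding every page version in memory at once, which matters for an article with thousands of revisions.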

Thanks


