[Tutor] Fwd: How to Load Every Revised Wikipedia Page Revision
dbosah
dbosah at buffalo.edu
Wed Feb 21 03:36:36 EST 2018
-------- Original message --------
From: Daniel Bosah <dbosah at buffalo.edu>
Date: 2/19/18 3:50 PM (GMT-05:00)
To: tutor at python.org
Subject: How to Load Every Revised Wikipedia Page Revision
Good day,
I'm doing research for a computer science group. I have a script that is supposed to load every revision of the Wikipedia article on FDR.
The script is supposed to do the following:

in a while loop
    use the requests library to call the Wikipedia API
    if 'continue' is in the response
        update the query dict with the 'continue' values
        keep updating until there is no more 'continue' (or until the API load limit is reached)
    else
        break
Here is the code:
import requests

def GetRevisions():
    url = "https://en.wikipedia.org/w/api.php"  # API endpoint
    query = {
        "format": "json",
        "action": "query",
        "titles": "Franklin D. Roosevelt",
        "prop": "revisions",
        "rvlimit": 500,
    }  # dictionary of the query arguments

    while True:
        # request the URL with the query parameters and decode the JSON response
        r = requests.get(url, params=query).json()
        print(repr(r))  # repr gives the "official" string representation of an object
        if 'continue' in r:
            # merge the continuation values into the query so the next request gets the next batch
            query.update(r['continue'])
        else:
            break  # no 'continue' left, so quit the loop
I want to load the full page content of every revision of the Wikipedia page, not just the information about each revision. How can I go about that?
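To make the goal concrete, here is a rough sketch of what I think is needed. It is not a tested solution. I am assuming that adding "content" to the rvprop parameter makes the API return the text of each revision, and that rvlimit has to be lower (50 here) when content is requested. The names iter_revisions and fetch_json are ones I made up for illustration; the fetch argument is only there so the loop can be tried without hitting the network.

```python
API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_json(params):
    # Hypothetical helper: one GET request to the API, decoded as JSON.
    # requests is imported here so the paging loop below can be tried
    # with a stand-in fetch function and no network access.
    import requests
    return requests.get(API_URL, params=params).json()


def iter_revisions(title, fetch=fetch_json, limit=50):
    """Yield one dict per revision, following 'continue' until it is gone."""
    query = {
        "format": "json",
        "action": "query",
        "titles": title,
        "prop": "revisions",
        # Assumption: 'content' asks the API to include the revision text.
        "rvprop": "ids|timestamp|content",
        "rvlimit": limit,
    }
    while True:
        r = fetch(query)
        # In the default JSON format, pages are keyed by page id.
        for page in r.get("query", {}).get("pages", {}).values():
            for rev in page.get("revisions", []):
                yield rev
        if "continue" not in r:
            break  # last batch reached
        query.update(r["continue"])  # carry the continuation values forward
```

Usage would be something like `for rev in iter_revisions("Franklin D. Roosevelt"): ...`. Which key of each revision dict holds the text seems to depend on the response format version, so I have left that part out.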
Thanks