Scraping multiple web pages help

Wed Feb 27 00:28:13 EST 2019

You need to obtain an API key first, from
https://regulationsgov.github.io/developers/

The Regulations.gov API is taking action to conserve system resources.
Beginning immediately, we will limit access to one account per
organization, and require approval for enabling accounts. Please contact
the Regulations.gov Help Desk if you would like to request an API key.
Please provide your name, email address, organization, and intended use of
the API.

Send email to:
regulations at erulemakinghelpdesk.com
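
In case it helps while you wait for a key: the listing URL in the quoted
message below already carries its paging in the query string. Here is a
minimal sketch, assuming `rpp` means results per page and `po` is the offset
of the first result (that is only my reading of the parameter names).
`page_urls` is a hypothetical helper, and this only builds the URLs; I have
not checked whether those pages return the comments as plain HTML.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

BASE = ("https://www.regulations.gov/docketBrowser"
        "?rpp=25&so=DESC&sb=commentDueDate&po=0&dct=PS&D=ED-2018-OCR-0064")


def page_urls(base_url, total_results):
    """Yield one listing URL per page by stepping the (assumed) offset `po`."""
    parts = urlsplit(base_url)
    query = dict(parse_qsl(parts.query))
    per_page = int(query["rpp"])  # assumed: results per page
    for offset in range(0, total_results, per_page):
        query["po"] = str(offset)  # assumed: offset of the first result
        yield urlunsplit(parts._replace(query=urlencode(query)))


# 60 is a made-up total, just to show three pages being generated.
urls = list(page_urls(BASE, 60))
for url in urls:
    print(url)
```

Working on the parsed query rather than on the raw string keeps the other
parameters (sort order, docket ID) untouched.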

Good luck,

Phu

On Mon, Feb 18, 2019 at 10:19 AM Drake Gossi <drake.gossi at gmail.com> wrote:

> Hello everyone,
>
> For a research project, I need to scrape a lot of comments from
> regulations.gov
>
>
> https://www.regulations.gov/docketBrowser?rpp=25&so=DESC&sb=commentDueDate&po=0&dct=PS&D=ED-2018-OCR-0064
>
> But partly what's throwing me is the URLs of the comments. They
> aren't consistent. I mean, there's some consistency insofar as the numbers
> that differentiate the pages all begin after that 0064 number in the URL
> listed above. But the differentiating numbers don't all have the same
> number of digits. Some have 4 (say, 4019) whereas others have 5 (say,
> 50343). But I don't think they go over 5. So this is a problem. I don't know
> how to write the code to access the multiple pages.
>
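
The varying number of digits need not be a problem. If the numbers are
simply a running counter appended to the docket ID, string formatting
handles both widths. A sketch, assuming comment IDs look like
ED-2018-OCR-0064-4019 and are zero-padded to at least four digits (an
assumption based on the examples above, not something I have verified);
`comment_id` is a hypothetical helper:

```python
DOCKET = "ED-2018-OCR-0064"


def comment_id(docket, n):
    """Build an ID such as ED-2018-OCR-0064-4019 (assumed format)."""
    # Pad to four digits; larger numbers such as 50343 keep all their digits.
    return f"{docket}-{n:04d}"


ids = [comment_id(DOCKET, n) for n in (7, 4019, 50343)]
print(ids)
```
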
> I should also mention I'm new to programming, so that's also a problem (if
> you can't already tell by the way I'm describing my problem).
>
> I should also mention that, I think, there's an API on regulations.gov,
> but I'm such a beginner that I don't even really know where to find it, or
> even what to do with it once I do. That's how helpless I am right now.
>
> Any help anyone could offer would be much appreciated.
>
> D