trying to begin a code for web scraping

Drake Gossi drake.gossi at gmail.com
Tue Feb 19 00:50:57 EST 2019


Hi everyone,

I'm trying to write code to scrape the comments from this website
<https://www.regulations.gov/document?D=ED-2018-OCR-0064-5403>
(regulations.gov), but I'm having trouble figuring out what to target in
the inspect pane (the one that opens when I right-click and choose
Inspect).

Although I eventually need to scrape all 11,000 or so comments related to
this event (by putting the code in a loop?), I'm still at the stage of
looking at individual comments. So, for example, with this comment
<https://www.regulations.gov/document?D=ED-2018-OCR-0064-5403>, I know
enough to right-click, choose Inspect, and look at the xml? (This is how
much of a beginner I am: what am I looking at when I right-click and
inspect?) Then I press Ctrl+F to find where the comment is in the code. For
that comment, the word I searched for was "troubling." So, I found the
comment buried in the xml.
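
For the loop part, here is a minimal sketch of how I imagine generating the
comment URLs. It assumes the comments are numbered consecutively at the end
of the document ID and zero-padded to four digits, which I have not
verified; comment_url is just a name I made up.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.regulations.gov/document"
DOCKET_PREFIX = "ED-2018-OCR-0064"  # taken from the comment URL above


def comment_url(number):
    """Build the URL for one comment.

    Assumption: IDs end in a counter zero-padded to at least 4 digits.
    """
    doc_id = f"{DOCKET_PREFIX}-{number:04d}"
    return f"{BASE_URL}?{urlencode({'D': doc_id})}"


# Three consecutive URLs starting at the comment I have been looking at.
urls = [comment_url(n) for n in range(5403, 5406)]
for url in urls:
    print(url)
```

Each URL would then be fetched and parsed inside the loop; this sketch only
builds the addresses.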

But my issue is this. I don't know what to target to scrape the comment
(and I assume that the same sequence of letters would apply to scraping
all of the comments in general). I assume what I grab is GIY1LSJISD. I'm
watching this video, and the person is targeting "tr" and "td," but mine
is not that easy. In other words, what is the essential bit of markup
(xml? code?) that, once copied into my code, would let me extract not only
this comment but all of the comments? ... ... soup.find_all('?')
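
To make the question concrete, here is a rough sketch of the kind of thing
I think I need: pull out the text of every element that carries a given
class. It uses the standard library's html.parser so it runs without extra
installs. The sample HTML is made up, ClassTextExtractor is my own name,
and whether GIY1LSJISD is really the right thing to match is exactly what
I am unsure about.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextExtractor(HTMLParser):
    """Collect the text inside every element carrying a given class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0      # greater than 0 while inside a matching element
        self.matches = []   # one cleaned-up string per matching element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth > 0:
            self.depth += 1             # nested tag inside a match
        elif self.target_class in classes:
            self.depth = 1              # entering a matching element
            self._chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth == 0:
            return
        self.depth -= 1
        if self.depth == 0:             # leaving the matching element
            text = " ".join("".join(self._chunks).split())
            self.matches.append(text)

    def handle_data(self, data):
        if self.depth > 0:
            self._chunks.append(data)


# Made-up stand-in for one comment page; the real markup may differ.
SAMPLE_HTML = """
<html><body>
  <div class="header">Comment from a member of the public</div>
  <div class="GIY1LSJISD">I find this proposal <b>troubling</b>.</div>
</body></html>
"""

parser = ClassTextExtractor("GIY1LSJISD")
parser.feed(SAMPLE_HTML)
comments = parser.matches
print(comments)
```

With BeautifulSoup, I believe the equivalent lookup would be
soup.find_all(class_="GIY1LSJISD"), but whether that class name appears in
the downloaded page at all, and is the same on every comment page, is an
assumption on my part.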

In sum, what I need to know is how to tell my Python code to ignore all of
the surrounding code and go straight in and grab the comment. Of course, I
need to grab other things too, like the name, category, date, and so on,
but I haven't gotten that far yet. Right now, I'm just trying to figure out
what I need to insert into my code so that I can get the comment.
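
For the later step, this is a small sketch of how I picture storing the
name, category, date, and comment once I can extract them: one row per
comment in a CSV file. The field names and the placeholder values are my
own; nothing here comes from the actual site, and write_comments is a
hypothetical helper.

```python
import csv
import io

# Column names are my own choice, based on the fields I want to collect.
FIELDS = ["name", "category", "date", "comment"]


def write_comments(records, handle):
    """Write one CSV row per comment record (a dict keyed by FIELDS)."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)


# Placeholder record standing in for one scraped comment.
records = [
    {"name": "PLACEHOLDER NAME", "category": "PLACEHOLDER CATEGORY",
     "date": "PLACEHOLDER DATE", "comment": "I find this troubling."},
]

# An in-memory buffer keeps the sketch from touching the disk;
# an open file object would work the same way.
buffer = io.StringIO()
write_comments(records, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```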

Help! I'm trying to learn to code on the fly. I'm an experienced researcher
but am new to coding. Any help you could give me would be tremendously
appreciated.

Best,
Drake


