BeautifulSoup help !!

Chris Angelico rosuav at gmail.com
Thu Oct 6 13:23:45 EDT 2016


On Fri, Oct 7, 2016 at 4:00 AM, Navneet Siddhant
<desolate.soul.me at gmail.com> wrote:
> I guess I shouldn't have mentioned that this was a recruitment task. If needed, I can post a screenshot of the mail I got, which says I can take help from anywhere possible as long as the assignment is done. I won't simply be copying and pasting the code, as questions about it will be asked in the interview.
> I just need a proper understanding of what I need to do to get the results.
> I also need to know how to export the results to CSV format.

A screenshot isn't necessary - we trust you to not flat-out lie to us.
(And if you did, well, this is a public list, so your deception would
burn you pretty thoroughly once someone finds out.) Stating that you
can take help from anywhere would have been a good clarification.

Anyway.

One of the annoying facts of the real world is that web scraping is
*hard*. We have awesome tools like Beautiful Soup that save us a lot
of hassle, but ultimately, you have to look at the results and figure
out which parts are "interesting" (that is, the parts that have the
data you want, or tell you about its structure, or something like
that). I strongly recommend messing with bs4 at the interactive
prompt; basically, just play around with everything you get hold of.
Eventually, you want to be building up a series of rows, where each
row is a list of column values; you write a row to the CSV file, and
your job's done. Most likely, you're going to have some sort of
primary loop - outside of that loop you have bs4 navigation to get you
the main block of info, and inside it, you parse through some wad of
stuff to find the truly interesting info, and at the bottom of the
loop, you write something to the CSV file.
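
To make that shape concrete, here is a rough sketch of the loop I'm
describing. The HTML, the tag names, the class names, and the column
names are all invented for illustration; the page you're scraping will
look different, so treat this as a pattern rather than a solution. It
assumes bs4 is installed and uses the standard library's csv module.

```python
import csv
import io

from bs4 import BeautifulSoup

# Made-up sample page (an assumption for this sketch); a real scrape
# would fetch the HTML from the site instead.
SAMPLE_HTML = """
<html><body>
  <div id="listing">
    <div class="item"><h2>Widget</h2><span class="price">$4.50</span></div>
    <div class="item"><h2>Gadget</h2><span class="price">$12.00</span></div>
  </div>
</body></html>
"""


def scrape_to_csv(html, out):
    """Write one CSV row per item found in html to the file-like object out."""
    soup = BeautifulSoup(html, "html.parser")
    writer = csv.writer(out)
    writer.writerow(["name", "price"])  # header row

    # Outside the loop: navigate to the main block of info.
    listing = soup.find("div", id="listing")

    # The primary loop: one iteration per record.
    for item in listing.find_all("div", class_="item"):
        # Inside the loop: dig out the truly interesting bits.
        name = item.find("h2").get_text(strip=True)
        price = item.find("span", class_="price").get_text(strip=True)
        # Bottom of the loop: write the row to the CSV file.
        writer.writerow([name, price])


if __name__ == "__main__":
    # An in-memory buffer keeps the sketch self-contained; for a real
    # file, use open("results.csv", "w", newline="") instead.
    buf = io.StringIO()
    scrape_to_csv(SAMPLE_HTML, buf)
    print(buf.getvalue())
```

Each of those find() and find_all() calls is the sort of thing you'd
work out first at the interactive prompt, by poking at the soup until
you can see where the data lives.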

Hope that's of some help!

ChrisA


