BeautifulSoup help !!

Michael Torrie torriem at gmail.com
Thu Oct 6 14:20:50 EDT 2016


On 10/06/2016 11:34 AM, Navneet Siddhant wrote:
> I guess I will have to extract data from multiple divs as only
> extracting data from the parent div which has other divs in it with
> the different data is coming up all messed up. Will play around and
> see if I could get through it. Let me clarify once again I dont need
> complete code , a resource where I could find more info about using
> Beautifulsoup will be appreciated.  Also do I need some kind of
> plugin etc to extract data to csv ? or it is built in python and I
> could simply import csv and write other commands needed ??

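Writing CSV files from Python is relatively easy, and no plugin is
needed: the csv module is part of the standard library, so a plain
"import csv" is enough.  You may need to understand basic Python data
types first, though, such as lists and dicts.  The module itself is
documented here: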

https://docs.python.org/3/library/csv.html

And there are numerous examples of its use:
https://pymotw.com/2/csv/
https://www.getdatajoy.com/examples/python-data-analysis/read-and-write-a-csv-file-with-the-csv-module

Those are just two of the first of many Google search results.
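
As a rough sketch of what writing a CSV file looks like, here is a small
example using only the standard library.  The rows and the field names
("name", "price") are made up for illustration, and the helper
write_rows() is my own name, not something from the csv module:

```python
import csv
import io

# Hypothetical rows, e.g. values scraped from a page.  The field names
# are invented for this example.
rows = [
    {"name": "Widget", "price": "9.99"},
    {"name": "Gadget", "price": "19.50"},
]


def write_rows(fileobj, rows):
    """Write a list of dicts as CSV, one dict per row."""
    writer = csv.DictWriter(fileobj, fieldnames=["name", "price"])
    writer.writeheader()
    writer.writerows(rows)


# An in-memory buffer is used here so the result can be printed.
# For a real file you would pass open("out.csv", "w", newline="") instead.
buffer = io.StringIO()
write_rows(buffer, rows)
print(buffer.getvalue())
```

Each dict becomes one line of output, with the header line written first.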

Sounds to me like you need to spend some time learning basic Python data
types and how to iterate through them (lists and dicts mainly).  BS4
leans on both for nearly everything: searches hand back list-like
results, and a tag's attributes are read like dict keys.  An hour or two
should be enough to get a handle on this.  The nice thing about Python is
that you can build things and run them in an incremental, and mostly
interactive, way.  Regularly print things out so you can see what
structure the data has.  BS4 is very good about string representations of
all its structures, so you can print them out without knowing anything
about them.  For example, if you were searching for a tag:

results = soup.find('a', attrs={'data-search-key': 'name'})

you can just do:
print(results)

and easily see how things are nested.  Then you can use that to drill
down, using list indexing on find_all() results or dict-style lookups on
a tag's attributes, to get just the part you need.
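
To make that concrete, here is a minimal sketch of the print-and-drill-down
approach.  The HTML is invented to stand in for your page (a parent div
with child divs), and the class names "listing", "title" and "price" are
assumptions of mine, so adjust them to whatever your real page uses:

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for the real page: a parent div with child divs.
html = """
<div class="listing">
  <div class="title"><a href="/item/1" data-search-key="name">Widget</a></div>
  <div class="price">9.99</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag (or None if nothing matches).
link = soup.find("a", attrs={"data-search-key": "name"})
print(link)             # shows the whole tag, attributes and all
print(link["href"])     # tag attributes are read like dict keys
print(link.get_text())  # just the text inside the tag

# find_all() returns a list-like result, so indexing and loops work.
parent = soup.find("div", class_="listing")
children = parent.find_all("div")
print(children[1].get_text())
for child in children:
    print(child["class"], child.get_text(strip=True))
```

Pulling each child div out separately like this, rather than taking the
text of the parent div in one go, keeps the different values from running
together.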

I suspect if they hire you and you work more on Python you'll grow to
really like it.

I suppose that Alister expressed consternation because much of what you
ask can be solved with some good old-fashioned Google searching, and
learning Python is no different from learning any other language,
including C#, which you already know.  Python is not C#, of course, but
the basic principles behind programming apply to nearly all programming
languages.  For a programmer, it shouldn't be too hard to move from
language to language.  Apart from some of the more idiomatic things about
a language, many aspects are syntactically equivalent.


