BeautifulSoup help !!

alister alister.ware at ntlworld.com
Thu Oct 6 12:04:08 EDT 2016


On Thu, 06 Oct 2016 08:50:25 -0700, Navneet Siddhant wrote:

> On Thursday, October 6, 2016 at 9:00:21 PM UTC+5:30, alister wrote:
>> On Thu, 06 Oct 2016 08:22:05 -0700, desolate.soul.me wrote:
>> 
>> > So I've just started with Python, and a company gave me an assignment
>> > as a recruitment task.
>> >
>> So by your own admission you have just started with Python, yet you
>> consider yourself suitable for employment?
>> 
>> 
>> --
>> "Unibus timeout fatal trap program lost sorry"
>> - An error message printed by DEC's RSTS operating system for the
>> PDP-11
> 
> 
> Yup ... further training will be provided; all they want to confirm is
> that I at least have basic knowledge of how the language works, so they
> won't have to tell me how to install Python on the system or any
> extensions to it. I'm not quite sure why they gave me this assignment
> when my CV clearly states that I'm good with .NET and not Python.
> 
> I hope you can help here and are not just trolling around.

Well, if this is just the start of training, that is a bit different. We don't
like writing code for homework assignments (even less for commercial
assignments), but we can give some general pointers.

Your code currently gets the div from the soup object of the page and, as you
say, it contains all of the HTML (it is also a soup object).
What you need to do is use another find_all on that div to extract the
elements that contain the information you require.

Depending on the page layout, you may need to drill down multiple levels and
combine multiple different elements to get what you require.
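
As a rough sketch of that drill-down (the HTML, the id, the class names and
the variable names below are all made up for illustration, not taken from the
page in the assignment), a nested find_all might look something like this:

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment; the real page's tags and classes will differ.
html = """
<div id="results">
  <div class="item"><h2>First</h2><span class="price">10</span></div>
  <div class="item"><h2>Second</h2><span class="price">20</span></div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Step 1: get the outer container (it can be searched just like the soup).
container = soup.find("div", id="results")

# Step 2: search inside the container for the repeating elements.
rows = []
for item in container.find_all("div", class_="item"):
    # Step 3: drill down one more level and combine several elements.
    title = item.find("h2").get_text(strip=True)
    price = item.find("span", class_="price").get_text(strip=True)
    rows.append((title, price))

print(rows)
```

How many levels are needed, and which tags or attributes to match on, depends
entirely on the real page.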

Performing the task manually with the page's HTML source before automating it
is the usual approach.
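
One possible way to do that manual inspection from Python, assuming the page
source is already in a string (the `html` value here is a made-up stand-in),
is to print an indented copy and read through it before writing any find_all
calls:

```python
from bs4 import BeautifulSoup

# Stand-in for the downloaded page source; purely illustrative.
html = '<div id="results"><div class="item"><h2>First</h2></div></div>'

soup = BeautifulSoup(html, "html.parser")

# prettify() returns the parsed tree as an indented string, which makes
# the nesting of the elements easier to follow by eye.
pretty = soup.prettify()
print(pretty)
```

The browser's "view source" or element inspector serves the same purpose.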
 



-- 
It turned out that the worm exploited three or four different holes in the
system.  From this, and the fact that we were able to capture and examine
some of the source code, we realized that we were dealing with someone very
sharp, probably not someone here on campus.
		-- Dr. Richard LeBlanc, associate professor of ICS, in
		   Georgia Tech's campus newspaper after the Internet worm.


