Help extracting info from HTML source ..

s. d. rose s_david_rose at hotmail.com
Thu Jan 25 20:52:40 EST 2007


Hello All.
  I am learning Python and have never worked with HTML.  However, I would 
like to write a simple script to audit my 100+ NetWare servers via their web 
portal.

  I was reading Chapter 8 of Dive Into Python, which deals with this topic. 
In the web portal of each server, there is a section similar to this:

  -->  clients and <A 
href="http://eugenia.blogsome.com/?s=ipkall">clever</a> services. <--

which I took from Slashdot.  What I'm talking about is the word 'clever', 
which is the visible text of the link to eugenia.blogsome.com.

What I'd like to do is save the two pieces of info keyed by the server 
name.  Probably in a dictionary, such as server1['link'] for the page on 
eugenia.blogsome.com and server1['description'] for 'clever'.

I've used the example from Dive Into Python to get the actual link (the 
href) out of the HTML source, but I don't know how to get the visible text 
that goes with that hyperlink.
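
For illustration, here is a minimal sketch of one way to capture both the 
href and the text between <a> and </a>.  It assumes Python 3's html.parser 
module rather than the sgmllib code that Dive Into Python builds on (sgmllib 
was removed in Python 3).  The class name LinkTextParser and its attributes 
are made up for this example.

```python
from html.parser import HTMLParser


class LinkTextParser(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> element.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self):
        super().__init__()
        self.links = []     # finished (href, text) pairs
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text chunks seen inside that <a>

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF=...> works too.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears while a link with an href is open.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# The snippet quoted earlier in this post, including its line break.
html = ('clients and <A \n'
        'href="http://eugenia.blogsome.com/?s=ipkall">clever</a> services.')

parser = LinkTextParser()
parser.feed(html)
parser.close()
print(parser.links)
# [('http://eugenia.blogsome.com/?s=ipkall', 'clever')]
```

Anchors without an href (such as named anchors) are skipped in this sketch, 
and nested tags inside a link would contribute only their text.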

So in the portal, I've got a link 'Scheduled Server Reboot' that goes to, 
say, /ScheduledTasks/ID000000003/ on Server1, using HTML source similar to 
the snippet above.
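
As a rough sketch of the per-server dictionary idea, the following stores 
each link and its description under the server name, and resolves a 
relative path like the one above against the server's address.  The function 
name record_links and the base URL http://server1.example.com/ are 
assumptions for illustration; the real portal address is not shown here. 
The (href, text) pairs are assumed to have been extracted from the page 
already.

```python
from urllib.parse import urljoin


def record_links(audit, server_name, base_url, pairs):
    """Append {'link', 'description'} entries for one server.

    `pairs` is an iterable of (href, text) tuples already pulled from the page.
    Hypothetical helper for illustration.
    """
    entries = audit.setdefault(server_name, [])
    for href, text in pairs:
        entries.append({
            # urljoin turns a relative href into a full URL for that server.
            "link": urljoin(base_url, href),
            "description": text,
        })
    return audit


audit = {}
record_links(
    audit,
    "Server1",
    "http://server1.example.com/",  # assumed address, not from the post
    [("/ScheduledTasks/ID000000003/", "Scheduled Server Reboot")],
)

for entry in audit["Server1"]:
    print(entry["description"], "->", entry["link"])
# Scheduled Server Reboot -> http://server1.example.com/ScheduledTasks/ID000000003/
```

A list of small dictionaries per server is used here because a portal page 
usually has more than one link; a single server1['link'] entry would hold 
only one.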

Can someone please help me?  Sure, I could manually go to each server, but I 
wouldn't learn anything.  I've learned some, but also have real deadlines, 
so I eagerly hope for any assistance & instruction.

Thank you!
-Dave
Shelton, CT 





