Screen scraper to get all 'a title' elements

Denis McMahon denismfmcmahon at gmail.com
Thu Nov 26 09:49:39 EST 2015


On Wed, 25 Nov 2015 12:42:00 -0800, ryguy7272 wrote:

> Hello experts.  I'm looking at this url:
> https://en.wikipedia.org/wiki/Wikipedia:Unusual_place_names
> 
> I'm trying to figure out how to list all 'a title' elements.

'a' is the element tag; 'title' is an attribute of that element (an 
HTMLAnchorElement in DOM terms).

Combining bs4 with a Python list comprehension lets you collect a given 
attribute from every element of a given type. For example, to get the 
class attribute of every paragraph that has one:

stuff = [p.attrs['class'] for p in soup.find_all('p')
         if 'class' in p.attrs]

Then you can print each one:

for thing in stuff:
    print thing

(Python 2.7)

This may be adaptable to your requirement.
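
As a rough sketch of that adaptation, the same comprehension can be 
pointed at 'a' elements and the 'title' attribute. The sample markup and 
the names html and titles below are made up for illustration, and this 
has not been run against the Wikipedia page itself:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
html = """
<p class="intro">Some places:</p>
<ul>
  <li><a href="/wiki/Boring,_Oregon" title="Boring, Oregon">Boring</a></li>
  <li><a href="/wiki/Hell,_Norway" title="Hell, Norway">Hell</a></li>
  <li><a href="#top">Back to top</a></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# Same pattern as the paragraph example: keep only the anchors that
# actually carry a title attribute, and collect that attribute's value.
titles = [a.attrs['title'] for a in soup.find_all('a')
          if 'title' in a.attrs]

# print(x) with a single argument behaves the same on Python 2.7 and 3.
for title in titles:
    print(title)
```

bs4 can also do the filtering itself: soup.find_all('a', title=True) 
returns only the anchors that have a title attribute.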

-- 
Denis McMahon, denismfmcmahon at gmail.com


