[Tutor] python and Beautiful soup question

Alex Kleider akleider at sonic.net
Mon Jun 22 03:41:54 CEST 2015


On 2015-06-21 15:55, Mark Lawrence wrote:
> On 21/06/2015 21:04, Joshua Valdez wrote:
>> I'm having trouble making this script work to scrape information from a
>> series of Wikipedia articles.
>> 
>> What I'm trying to do is iterate over a series of wiki URLs and pull out
>> the page links on a wiki portal category (e.g.
>> https://en.wikipedia.org/wiki/Category:Electronic_design).
>> 
>> I know that all the wiki pages I'm going through have a page links section.
>> However, when I try to iterate through them I get this error message:
>> 
>> Traceback (most recent call last):
>>    File "./wiki_parent.py", line 37, in <module>
>>      cleaned = pages.get_text()
>> AttributeError: 'NoneType' object has no attribute 'get_text'
> 
> Presumably because this line
> 
>>      pages = soup.find("div" , { "id" : "mw-pages" })
> 
> doesn't find anything, pages is set to None and hence the attribute
> error on the next line.  I'm suspicious of { "id" : "mw-pages" } as
> it's a Python dict comprehension with one entry of key "id" and value
> "mw-pages".

Why do you refer to { "id" : "mw-pages" } as a dict comprehension?
Is that what a simple dict declaration is?
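
For comparison, here is a minimal sketch of the two forms as I understand
them. The names `names`, `values`, `generated` and `same_as_literal`, and the
"class"/"mw-category" pair, are made up purely for illustration and are not
from Joshua's script. Either way the result is an ordinary dict, so the way
the dict is written would not, as far as I can tell, explain why soup.find()
returns None.

```python
# A dict literal (the language reference calls it a "dict display"):
# the key/value pairs are written out directly.
attrs = {"id": "mw-pages"}

# A dict comprehension: the pairs are generated by a for clause.
# (Hypothetical data, for illustration only.)
names = ["id", "class"]
values = ["mw-pages", "mw-category"]
generated = {name: value for name, value in zip(names, values)}

# A comprehension that happens to build the same one-entry dict.
same_as_literal = {k: v for k, v in [("id", "mw-pages")]}

print(attrs == same_as_literal)        # True
print(type(attrs) is type(generated))  # True
```

Both spellings produce a plain dict object; only the syntax used to build it
differs.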


