[Tutor] python and Beautiful soup question

Mark Lawrence breamoreboy at yahoo.co.uk
Mon Jun 22 15:39:07 CEST 2015


On 22/06/2015 02:41, Alex Kleider wrote:
> On 2015-06-21 15:55, Mark Lawrence wrote:
>> On 21/06/2015 21:04, Joshua Valdez wrote:
>>> I'm having trouble making this script work to scrape information from a
>>> series of Wikipedia articles.
>>>
>>> What I'm trying to do is iterate over a series of wiki URLs and pull out
>>> the page links on a wiki portal category (e.g.
>>> https://en.wikipedia.org/wiki/Category:Electronic_design).
>>>
>>> I know that all the wiki pages I'm going through have a page links
>>> section.
>>> However when I try to iterate through them I get this error message:
>>>
>>> Traceback (most recent call last):
>>>    File "./wiki_parent.py", line 37, in <module>
>>>      cleaned = pages.get_text()
>>> AttributeError: 'NoneType' object has no attribute 'get_text'
>>
>> Presumably because this line
>>
>>>      pages = soup.find("div" , { "id" : "mw-pages" })
>>
>> doesn't find anything, pages is set to None and hence the attribute
>> error on the next line.  I'm suspicious of { "id" : "mw-pages" } as
>> it's a Python dict comprehension with one entry of key "id" and value
>> "mw-pages".
>
> Why do you refer to { "id" : "mw-pages" } as a dict comprehension?
> Is that what a simple dict declaration is?
>

No, I'm simply wrong: it's just a plain dict literal, not a
comprehension.  Please don't ask, as I've no idea how it got into my
head :)
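
For the record, here is a minimal sketch of the difference.  The
`pairs` list and the "class" entry are made up for illustration and
are not from the original script; only { "id" : "mw-pages" } comes
from the thread.

```python
# A dict literal (the language reference calls it a "dict display"):
# every key and value is written out explicitly.
attrs = {"id": "mw-pages"}

# A dict comprehension: the entries are generated by a for clause.
# `pairs` is a hypothetical input, invented for this example.
pairs = [("id", "mw-pages"), ("class", "mw-category")]
attrs_from_pairs = {key: value for key, value in pairs}

# Both forms produce an ordinary dict object.
print(type(attrs) is dict)
print(type(attrs_from_pairs) is dict)
print(attrs_from_pairs["id"] == attrs["id"])
```

Either way the result is a plain dict, so the form used to build it
makes no difference to whatever function receives it.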

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence


