Download unnamed web image?

Tue Feb 16 22:14:33 EST 2010

galileo228 wrote:
> On Feb 16, 9:40 pm, galileo228 <mattbar... at gmail.com> wrote:
>> On Feb 16, 8:48 pm, John Bokma <j... at castleamber.com> wrote:
>>
>>
>>
>>> galileo228 <mattbar... at gmail.com> writes:
>>>> Using BeautifulSoup, mechanize, and urllib, I've constructed the
>>>> following:
>>>> br.open("http://www.school.edu/students/facebook/")
>>>> br.select_form(nr = 1)
>>>> br.form['fulltextsearch'] = 'msb83' # this searches the facebook for me
>>>> br.submit()
>>>> results = br.response().read()
>>>> soup = BeautifulSoup(results)
>>>> foo2 = soup.find('td', attrs={'width':'95'})
>>>> foo3 = foo2.find('a')
>>>> foo4 = foo3.find('img', attrs={'src':'deliverImage.cfm?netid=msb83'})
>>>> # this just drills down to the <img> line, and until this point the
>>>> # program does not return an error
>>>> save_as = os.path.join('./', msb83 + '.jpg')
>>>> urllib.urlretrieve(foo4, save_as)
>>>>
>>>> I get the following error msg after running this code:
>>>> AttributeError: 'NoneType' object has no attribute 'strip'
>>> Wild guess, since you didn't provide line numbers, etc.
>>> foo4 is None
>>> (I also would like to suggest to use more meaningful names)
>>> --
>>> John Bokma                                                               j3b
>> I thought it was too, and I just doublechecked.  It's actually
>>
>> foo3 = foo2.find('a')
>>
>> that is causing the NoneType error.
>>
>> Thoughts?
> 
> I've now fixed the foo3 issue, and I now know that the problem is with
> the urllib.urlretrieve line (see above). This is the error msg I get
> in IDLE:
> 
> Traceback (most recent call last):
>   File "/Users/Matt/Documents/python/dtest.py", line 59, in <module>
>     urllib.urlretrieve(foo4, save_as)
>   File "/Library/Frameworks/Python.framework/Versions/2.6/lib/
> python2.6/urllib.py", line 94, in urlretrieve
>     return _urlopener.retrieve(url, filename, reporthook, data)
>   File "/Library/Frameworks/Python.framework/Versions/2.6/lib/
> python2.6/urllib.py", line 226, in retrieve
>     url = unwrap(toBytes(url))
>   File "/Library/Frameworks/Python.framework/Versions/2.6/lib/
> python2.6/urllib.py", line 1033, in unwrap
>     url = url.strip()
> TypeError: 'NoneType' object is not callable
> 
> Is this msg being generated because I'm trying to retrieve a url
> that's not really a file?

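It's because the URL you're passing in, namely foo4, is not a string.
foo3.find() returns None if it can't find the entry, and passing None
to urlretrieve() gives the AttributeError from your first post. This
traceback is a TypeError instead, which suggests foo4 is now the <img>
Tag object itself: BeautifulSoup treats an unknown attribute such as
.strip as a search for a child tag, finds none, and returns None, which
urllib then tries to call.

You checked the value of foo3, but did you check the value and type of
foo4? urlretrieve() needs the URL as a string, e.g. foo4['src'] joined
onto the address of the page.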
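
A minimal sketch of that check, under a few assumptions: it is written
for Python 3 (where urljoin lives in urllib.parse; the thread uses
Python 2.6), the helper name image_url is made up for illustration, and
a plain dict stands in for the BeautifulSoup <img> tag so the snippet
runs without BeautifulSoup installed. It fails early if the tag is
missing and otherwise turns the relative src into a full URL string.

```python
from urllib.parse import urljoin


def image_url(page_url, img_tag):
    """Return the absolute URL of an <img> tag, or raise if it is missing.

    image_url is a hypothetical helper, not part of BeautifulSoup or urllib.
    """
    if img_tag is None:
        # find() returns None when nothing matches; fail here, not inside urllib.
        raise ValueError("no matching <img> tag found")
    src = img_tag.get("src")  # .get() works on a BeautifulSoup Tag or a dict
    if not src:
        raise ValueError("<img> tag has no src attribute")
    # Resolve a relative src such as 'deliverImage.cfm?netid=msb83'
    # against the address of the page it came from.
    return urljoin(page_url, src)


page = "http://www.school.edu/students/facebook/"
url = image_url(page, {"src": "deliverImage.cfm?netid=msb83"})
print(url)
```

The resulting string, rather than the tag object, is what would then be
handed to urlretrieve() (urllib.request.urlretrieve in Python 3) along
with save_as.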