Tag objects in Beautiful Soup
Peter Otten
__peter__ at web.de
Thu Nov 20 10:02:08 EST 2014
Simon Evans wrote:
> Re:'Accessing the Tag object from Beautiful Soup' (page 22-25 - Getting
> Started with Beautiful Soup) So far the code to python27 runs as given in
> the book, re: -
> ----------------------------------------------------------------------------
>>>> html_atag = """<html><body><p>Test html a tag example</p>
> ... <a href="http://www.packtpub.com'>Home</a>
> ... <a href="http;//www.packtpub.com/books'.Books</a>
> ... </body>
> ... </html>"""
>>>> soup = BeautifulSoup(html_atag,'lxml')
>>>> atag = soup.a
>>>> print(atag)
> <a href="http://www.packtpub.com'>Home</a>
> <a href=" http="">
> </a>
>>>> type(atag)
> <class 'bs4.element.Tag'>
>>>>
>>>> tagname = atag.name
>>>> print tagname
> a
>>>> atag.name = 'p'
>>>> print (soup)
> <html><body><p>Test html a tag example</p>
> <p href="http://www.packtpub.com'>Home</a>
> <a href=" http="">
> </p></body>
> </html>
> ----------------------------------------------------------------------------
> then under the next Sub heading : 'Attributes of a Tag object'
> text reads :
There is no assignment
soup_atag = whatever
but there is one to atag. The whole session should work when you omit the
offending line
> atag = soup_atag.a
or insert
soup_atag = soup
before it.
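
For illustration, here is a minimal sketch of the second option as a standalone script. It assumes Python 3 with bs4 installed, uses the built-in "html.parser" instead of lxml, and builds a fresh soup rather than reusing the one from the session above. It also assumes the book's markup has balanced quotes around the href values, which the expected output suggests but the post does not confirm.

```python
from bs4 import BeautifulSoup

# Assumption: the book's markup with balanced quotes around each href.
html_atag = """<html><body><p>Test html a tag example</p>
<a href="http://www.packtpub.com">Home</a>
<a href="http://www.packtpub.com/books">Books</a>
</body>
</html>"""

# Bind the name that the book's next snippet expects before it is used.
# Without this assignment, `soup_atag.a` raises NameError.
soup_atag = BeautifulSoup(html_atag, "html.parser")

atag = soup_atag.a      # first <a> tag in the document
href = atag["href"]     # attribute access works like a dict lookup
print(href)
```

With the name bound first, the lookup of atag['href'] no longer depends on anything defined earlier in the interactive session.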
> print (atag['href'])
>
> #output
> http://www.packtpub.com
>
> however when I put this code to the console I get error returns at the
> first line re:-
> ----------------------------------------------------------------------------
>>>> atag = soup_atag.a
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> NameError: name 'soup_atag' is not defined
>>>>
> ----------------------------------------------------------------------------
> Can anyone tell me where I am going wrong or where the text is wrong ?
> So far the given code has run okay, I have put to the console everything
> the text tells you to. Thank you for reading.
> Simon Evans