How to print out html tags excluding the attributes

Michael F. Stemper michael.stemper at gmail.com
Sun Jul 21 16:41:45 EDT 2019


On 20/07/2019 20.04, sum abiut wrote:
> I want to use a regular expression to print out the HTML tags excluding the
> attributes.
> 
> for example:
> 
> import re
> html = '<h1>Hi</h1><p>test <span class="time">test</span></p>'
> tags = re.findall(r'<[^>]+>', html)
> for a in tags:
>     print(a)
> 
> 
> the output is :
> 
> <h1>
> </h1>
> <p>
> <span class="time">
> </span>
> </p>
> 
> But I just want the tag, not the attributes

Try this:

for a in tags:
    a = re.sub( " .*>", ">", a )
    print(a)

(The two statements could be combined.)
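
For instance, one way of combining them might look like the sketch below. It reuses the same two patterns from this thread; the name `bare_tags` is only for illustration, and it assumes a tag's name is separated from its attributes by a plain space, as in the example.

```python
import re

html = '<h1>Hi</h1><p>test <span class="time">test</span></p>'

# Find every tag, then drop everything from the first space up to the closing ">".
# Tags without attributes contain no space, so they pass through unchanged.
bare_tags = [re.sub(" .*>", ">", tag) for tag in re.findall(r"<[^>]+>", html)]

for tag in bare_tags:
    print(tag)
```

With the sample string this prints `<span>` in place of `<span class="time">` and leaves the other tags as they were. For anything beyond simple, well-formed input, a real parser such as the standard library's html.parser is probably a safer choice than a regular expression.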

-- 
Michael F. Stemper
Galatians 3:28


