How to print out html tags excluding the attributes

sum abiut suabiut at gmail.com
Sat Jul 20 21:04:30 EDT 2019


I want to use a regular expression to print out the HTML tags without their
attributes.

For example:

import re
html = '<h1>Hi</h1><p>test <span class="time">test</span></p>'
tags = re.findall(r'<[^>]+>', html)
for a in tags:
    print(a)


The output is:

<h1>
</h1>
<p>
<span class="time">
</span>
</p>

But I just want the tags, not the attributes:

<h1>
</h1>
<p>
<span >
</span>
</p>
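
One possible approach, offered as a minimal sketch rather than a definitive answer: capture only the optional leading slash and the tag name, let the rest of the tag match without being captured, and rebuild each tag from the two captured pieces. The names tag_pattern and bare_tags are illustrative, not from the original code. This assumes no attribute value contains a literal '>', and it prints <span> without the trailing space shown above.

```python
import re

html = '<h1>Hi</h1><p>test <span class="time">test</span></p>'

# Group 1: optional '/' of a closing tag. Group 2: the tag name.
# [^>]* consumes any attributes (or a trailing '/') without capturing them.
tag_pattern = re.compile(r'<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>')

# findall returns (slash, name) tuples because the pattern has two groups.
bare_tags = ['<{}{}>'.format(slash, name)
             for slash, name in tag_pattern.findall(html)]

for tag in bare_tags:
    print(tag)
```

For real-world HTML (comments, scripts, '>' inside attribute values), the standard library's html.parser.HTMLParser is likely a more robust choice than a regular expression.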


