[Tutor] Using Beautiful Soup to extract tag names

Tue Mar 14 17:05:26 CET 2006

As always, Kent, you're amazing.

That will do perfectly.  (Though the ElementTree documentation seems a
bit difficult to follow.  I'm sure I'll get through it
eventually.)

Thanks

Ed

On 14/03/06, Kent Johnson <kent37 at tds.net> wrote:
> Ed Singleton wrote:
> > I have (unfortunately) received some data in XML format.  I need to
> > use it in Python, preferably as a list of dictionaries.  The data is a
> > flat representation of a table, in the style:
> >
> > <tablename>
> > <fieldname1>Some Data</fieldname1>
> > <fieldname2>Some Data</fieldname2>
> > ...
> > </tablename>
> > <tablename>
> > <fieldname1>Some Data</fieldname1>
> > <fieldname2>Some Data</fieldname2>
> > ...
> >
> > and so on (where tablename is always the same in one file).
>
> ElementTree makes short work of this:
>
> # ElementTree as shipped in the standard library
> import xml.etree.ElementTree as ET
>
> xml = '''
> <data><tablename>
> <fieldname1>Some Data1</fieldname1>
> <fieldname2>Some Data2</fieldname2>
> </tablename>
> <tablename>
> <fieldname3>Some Data3</fieldname3>
> <fieldname4>Some Data4</fieldname4>
> </tablename>
> </data>'''
>
> # parse the string; doc is the root <data> element
> doc = ET.fromstring(xml)
> # use ET.parse() to parse a file
>
> for table in doc.findall('tablename'):
>     # iterating over an element yields its direct children
>     for field in table:
>         print(field.tag, field.text)
>
>
> prints:
> fieldname1 Some Data1
> fieldname2 Some Data2
> fieldname3 Some Data3
> fieldname4 Some Data4
>
> If speed is an issue then look at cElementTree which has the same
> interface and is blazingly fast.
> http://effbot.org/zone/element.htm
>
> Kent
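For the list-of-dictionaries form asked about in the original question, the same loop can be folded into a comprehension. What follows is only a minimal sketch, not something from the thread: it assumes the standard-library xml.etree.ElementTree module rather than the standalone elementtree package, and the helper name tables_to_dicts and its row_tag parameter are made up for illustration. It also assumes the records are wrapped in a single root element (here <data>, as in Kent's sample), since a file containing only repeated <tablename> elements is not well-formed XML on its own.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the example above; <data> is the wrapper root.
SAMPLE = """
<data><tablename>
<fieldname1>Some Data1</fieldname1>
<fieldname2>Some Data2</fieldname2>
</tablename>
<tablename>
<fieldname3>Some Data3</fieldname3>
<fieldname4>Some Data4</fieldname4>
</tablename>
</data>"""


def tables_to_dicts(xml_text, row_tag="tablename"):
    """Return one dict per <row_tag> element, mapping child tag -> text.

    Hypothetical helper: the name and signature are illustrative only.
    """
    root = ET.fromstring(xml_text)
    # Each direct child of a row becomes one key/value pair.
    return [
        {field.tag: field.text for field in row}
        for row in root.findall(row_tag)
    ]


rows = tables_to_dicts(SAMPLE)
print(rows)
```

Each dictionary corresponds to one <tablename> record, so rows[0]['fieldname1'] gives the text of the first record's first field. If a record repeats a tag name, later values overwrite earlier ones in this sketch.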