New to Programming - XML Processing

catperson me at anonymous.invalid
Wed Apr 1 18:33:48 EDT 2015


On Tue, 31 Mar 2015 21:17:38 -0700 (PDT), Rustom Mody
<rustompmody at gmail.com> wrote:

>On Wednesday, April 1, 2015 at 8:57:15 AM UTC+5:30, catperson wrote:
>> I am new to programming, though not new to computers.  I'm looking to
>> teach myself Python 3 and am working my way through a tutorial.  At
>> the point I'm at in the tutorial I am tasked with parsing out an XML
>> file created with a Garmin Forerunner and am just having a terrible
>> time getting my head around the concepts.  What I'm looking for is
>> some suggested reading that might give me some of the theory of
>> operation behind ElementTree and then how to parse out specific
>> elements.  Most of what I have been able to find in examples that I
>> can understand use very simplistic XML files and this Garmin file is
>> many levels of sub-elements and some of those elements have attributes
>> assigned, like <Activity Sport="Running">.
>> 
>> I'm hoping with enough reading I can experiment and work my way
>> through the problem and end up with a hopefully clear understanding of
>> the ElementTree module and dictionaries.
>> 
>> Thanks for any suggestions in advance.
>
>Suggestions:
>1. Learn to use the interpreter interactively, i.e. (at the least)¹:
>
>a. Start up python (without a program)
>b. Play around with trivial expressions
>c. Explore introspective features - help(), type(), dir()
>
>2. Do you know about triple-quoted strings?
>a. Start with small (or trivial) sub-parts of your XML as triple-quoted examples in the
>interpreter and start throwing them at ElementTree
>b. If they don't work, trivialize further; if they work, add complexity
>-----------
>¹ At the least because environments like Idle are more conducive to such playing

I thank everyone for their feedback.  I like the above advice and also
Rustom's other comments about learning dictionaries.  I'm thinking in
the back of my mind my issue might be more around the dictionary than
ElementTree at this point.  
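
A minimal sketch of what the triple-quoted-string suggestion might look
like in the interpreter.  The fragment is typed by hand and trimmed down
(it is not the real Garmin file, and the namespace is left out on purpose
to keep it trivial); the names snippet and activity are made up for
illustration.

```python
import xml.etree.ElementTree as etree

# A hand-made, trivial fragment (not the real Garmin file; no namespace).
snippet = """
<Activities>
  <Activity Sport="Running">
    <Id>2011-06-06T12:21:04Z</Id>
  </Activity>
</Activities>
"""

root = etree.fromstring(snippet)   # parse from a string instead of a file
activity = root[0]                 # child elements can be indexed like a list

print(type(activity))              # what kind of object is this?
print(activity.tag)                # the element name
print(activity.attrib)             # attributes come back as a plain dict
print(activity.attrib["Sport"])    # ordinary dictionary lookup
print(activity[0].text)            # the text inside <Id>
```

From there, dir(activity) and help(activity) list what else an element can
do, which ties in with the introspection advice above.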

I'll fill in my situation a little based on the queries back to me.

I'm using a book, Python Programming Fundamentals by Kent D. Lee.  

http://knuth.luther.edu/~leekent/IntroToComputing/ 

It looks like he has updated his website since I last looked at it.  I
know there are many tutorials out there, however, I picked this one
and resolved to complete it start to finish.  That means a lot of side
research and reading, which I'm happy to do.  

I've got somewhat of a handle on the basic XML parsing process (my
opinion).  The book uses minidom, but my reading suggests that
ElementTree is a better option, so I thought I'd switch to that and
attempt to duplicate the exercise.

I understand this bit:

import xml.etree.ElementTree as etree
tree = etree.parse('workout.tcx')
root = tree.getroot()

and when I experiment and type root[1][1][0] in a console I can get
the Id element from the XML file with a start time for the workout. That
tells me it parsed properly.  I'm having trouble getting my head
around iterating over the elements, pulling out an Activity element
with a Sport attribute of either running or biking, then getting the multiple
Trackpoint elements generated by the Forerunner every 2 minutes, with
an ultimate goal of creating a graph using turtle graphics (later in
the exercise).
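
As a rough sketch only, the kind of iteration described above might look
like this on a toy fragment.  The fragment is trimmed by hand, the second
Activity and its Id are invented for illustration, and NS, snippet and
running_ids are hypothetical names.  One detail that is easy to miss:
because the root element declares a default xmlns, ElementTree stores
every tag as "{namespace}Name".

```python
import xml.etree.ElementTree as etree

# Namespace copied from the xmlns attribute on the root of the sample file.
NS = "http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2"

# Toy fragment; the "Biking" activity and its Id are made up.
snippet = """
<TrainingCenterDatabase xmlns="%s">
  <Activities>
    <Activity Sport="Running">
      <Id>2011-06-06T12:21:04Z</Id>
    </Activity>
    <Activity Sport="Biking">
      <Id>2011-06-07T09:00:00Z</Id>
    </Activity>
  </Activities>
</TrainingCenterDatabase>
""" % NS

root = etree.fromstring(snippet)
print(root.tag)                    # shows the "{namespace}Name" form

running_ids = []
# iter() walks the whole tree and yields every element with this tag.
for activity in root.iter("{%s}Activity" % NS):
    # get() reads one attribute and returns None if it is absent.
    if activity.get("Sport") == "Running":
        running_ids.append(activity.find("{%s}Id" % NS).text)

print(running_ids)
```

The same pattern (iter() for a tag, then find() for a child) would
presumably extend to the Trackpoint elements, but that is left untested
here.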

I initially thought my trouble was with parsing the XML, but now I'm
thinking my problem is I didn't pay enough attention to iterating over
a dictionary.  I will focus my efforts on Rustom's suggestions so
please don't give me a solution.  Some of my background is automotive
technician, and a statement by one of my instructors has stuck in my
mind over the years.  He said: a friend's car won't start, do you take
booster cables or a can of gas with you?  If you understand the theory
of operation of an internal combustion engine, you can solve almost
any problem.

This is a small sample of the XML file.  It's 170239 lines currently.

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<TrainingCenterDatabase
xmlns="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.garmin.com/xmlschemas/ActivityExtension/v2
http://www.garmin.com/xmlschemas/ActivityExtensionv2.xsd
http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2
http://www.garmin.com/xmlschemas/TrainingCenterDatabasev2.xsd">

  <Folders/>

  <Activities>
    <Activity Sport="Running">
      <Id>2011-06-06T12:21:04Z</Id>
      <Lap StartTime="2011-06-06T12:21:04Z">
        <TotalTimeSeconds>3283.1700000</TotalTimeSeconds>
        <DistanceMeters>4821.6821289</DistanceMeters>
        <MaximumSpeed>2.8058963</MaximumSpeed>
        <Calories>398</Calories>
        <AverageHeartRateBpm xsi:type="HeartRateInBeatsPerMinute_t">
          <Value>125</Value>
        </AverageHeartRateBpm>
        <MaximumHeartRateBpm xsi:type="HeartRateInBeatsPerMinute_t">
          <Value>167</Value>
        </MaximumHeartRateBpm>
        <Intensity>Active</Intensity>
        <TriggerMethod>Manual</TriggerMethod>
        <Track>
          <Trackpoint>
            <Time>2011-06-06T12:21:04Z</Time>
            <Position>
              <LatitudeDegrees>34.5225040</LatitudeDegrees>
              <LongitudeDegrees>-77.3563351</LongitudeDegrees>
            </Position>
            <AltitudeMeters>-0.3433838</AltitudeMeters>
            <DistanceMeters>0.0000000</DistanceMeters>
            <HeartRateBpm xsi:type="HeartRateInBeatsPerMinute_t">
              <Value>116</Value>
            </HeartRateBpm>
            <SensorState>Absent</SensorState>
          </Trackpoint>

Jim.


