From robert.rawlins at thinkbluemedia.co.uk Tue Apr 1 11:15:16 2008
From: robert.rawlins at thinkbluemedia.co.uk (Robert Rawlins)
Date: Tue, 1 Apr 2008 10:15:16 +0100
Subject: [XML-SIG] Learning to use elementtree
In-Reply-To: <2323A6D37908A847A7C32F1E3662C80E017BDC9A@dc1ex01.air.org>
References: <2323A6D37908A847A7C32F1E3662C80E017BDC9A@dc1ex01.air.org>
Message-ID: <008f01c893d8$e6c9e830$b45db890$@rawlins@thinkbluemedia.co.uk>

Hi Harold,

The best bet is to post a small example of your XML as an attachment to an
email to the list; we can then have a look through it and give you a hand.
As well as the XML, it would be worth giving a little background on what
you want to achieve, that is, what you want to do with the XML data.

Cheers,

Robert

From: xml-sig-bounces at python.org [mailto:xml-sig-bounces at python.org] On Behalf Of Doran, Harold
Sent: 31 March 2008 18:35
To: xml-sig at python.org
Subject: [XML-SIG] Learning to use elementtree

Dear List:

I am brand new to XML and have some experience with Python, using it to
parse text files. Now, however, I need to use Python to parse some XML
files. I am working with elementtree right now and am able to make this
work on some toy examples. But now I am trying to apply the code I have
written to a real XML file I need to work with, and I am hitting a road
block.

Is anyone on this list willing to look at an XML file I can send them and
work with me through a small example to see if I can get this to work?

I am working with Python 2.5.2 on Windows XP.

Harold

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/xml-sig/attachments/20080401/7ba0929a/attachment.htm
From dstanek at dstanek.com Tue Apr 1 12:38:35 2008
From: dstanek at dstanek.com (David Stanek)
Date: Tue, 1 Apr 2008 06:38:35 -0400
Subject: [XML-SIG] Learning to use elementtree
In-Reply-To: <2323A6D37908A847A7C32F1E3662C80E017BDC9A@dc1ex01.air.org>
References: <2323A6D37908A847A7C32F1E3662C80E017BDC9A@dc1ex01.air.org>
Message-ID:

2008/3/31 Doran, Harold :
>
> But, now I am trying to apply the code I have written to a real xml file I
> need to work with and things are hitting a road block. Is anyone on this
> list willing to look at an xml file I can send them and work with me through
> a small example to see if I can get this to work?
>

What is the road block?

--
David
http://www.traceback.org
From HDoran at air.org Tue Apr 1 14:35:37 2008
From: HDoran at air.org (Doran, Harold)
Date: Tue, 1 Apr 2008 08:35:37 -0400
Subject: [XML-SIG] Learning to use elementtree
In-Reply-To:
Message-ID: <2323A6D37908A847A7C32F1E3662C80E017BDCC1@dc1ex01.air.org>

David et al:

Attached is a sample XML file. Below is my Python code. I am using Python
2.5.2 on a Windows XP machine.

Test.py

    from xml.etree.ElementTree import ElementTree as ET

    # open the output file for writing
    f = open('output.txt', 'w')

    # raw string so the backslashes in the Windows path are kept literally
    et = ET(file=r'g:\python\ml\out_g3r_b2.xml')

    for statentityref in et.findall('admin/responseanalyses/analysis/analysisdata/statentityref'):
        for statval in et.findall('admin/responseanalyses/analysis/analysisdata/statentityref/statval'):
            print >> f, statentityref.attrib['id'], '\t', statval.attrib['type'], '\t', statval.attrib['value']

    f.close()

If you run this you will see the output organized almost exactly as I need
it. But there is a bug in my program, which I suspect is in the order in
which I am looping. For example, here is a snippet of output from the file
output.txt. I've added some comments so you can see where I am struggling.

    9568    OmitCount               0.000000    # This is correct
    9568    NotReachedCount         0.000000    # This is correct
    9568    PolyserialCorrelation   0.602525    # This is correct
    9568    AdjustedPolyserial      0.553564    # This is correct
    9568    AverageScore            0.817348    # This is correct
    9568    StdevItemScore          0.386381    # This is correct
    9568    OmitCount               0.000000    # This is NOT correct
    9568    NotReachedCount         0.000000    # This is NOT correct
    9568    PolyserialCorrelation   0.672088    # This is NOT correct
    9568    AdjustedPolyserial      0.590175    # This is NOT correct
    9568    AverageScore            1.034195    # This is NOT correct
    9568    StdevItemScore          0.926668    # This is NOT correct

Now, here is what *should* be returned. Note that I have manually changed
the item id (the number preceding the text) to 9569. The data are pulled in
correctly, but for some reason I am not looping properly to get the correct
item ID to line up with its corresponding data.

    9568    OmitCount               0.000000
    9568    NotReachedCount         0.000000
    9568    PolyserialCorrelation   0.602525
    9568    AdjustedPolyserial      0.553564
    9568    AverageScore            0.817348
    9568    StdevItemScore          0.386381
    9569    OmitCount               0.000000    # Note the item ID has been modified here and below.
    9569    NotReachedCount         0.000000
    9569    PolyserialCorrelation   0.672088
    9569    AdjustedPolyserial      0.590175
    9569    AverageScore            1.034195
    9569    StdevItemScore          0.926668

Last, notice this portion of the code:

    'admin/responseanalyses/analysis/analysisdata/statentityref'

I know this is the path to use only because I manually went through the XML
file to examine its hierarchical structure. I assume this is bad practice.
Is there a way to examine the parent-child structure of an XML file in
Python so I can see the hierarchical structure?

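One possible way to inspect the hierarchy asked about above is to walk the tree recursively and print each distinct tag path once. This is only a sketch under stated assumptions: it is written for current Python 3 rather than the 2.5.2 used in this thread, the `SAMPLE` document and its `<report>` root are hypothetical stand-ins for the attached file (only the element names from the quoted path come from the thread), and `tag_paths` and `unique_paths` are illustrative helper names.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the attached file. The <report> root is an
# assumption; the nested element names come from the path in the thread.
SAMPLE = """<report>
  <admin>
    <responseanalyses>
      <analysis>
        <analysisdata>
          <statentityref id="9568">
            <statval type="OmitCount" value="0.000000"/>
          </statentityref>
          <statentityref id="9569">
            <statval type="OmitCount" value="0.000000"/>
          </statentityref>
        </analysisdata>
      </analysis>
    </responseanalyses>
  </admin>
</report>"""


def tag_paths(element, prefix=""):
    """Yield the slash-separated tag path of every element, depth first."""
    path = prefix + "/" + element.tag if prefix else element.tag
    yield path
    for child in element:
        yield from tag_paths(child, path)


def unique_paths(root):
    """Return each distinct tag path once, in document order."""
    seen = []
    for path in tag_paths(root):
        if path not in seen:
            seen.append(path)
    return seen


root = ET.fromstring(SAMPLE)
for path in unique_paths(root):
    print(path)
```

The printed paths include the root tag, whereas the path strings passed to `findall` in this thread start one level below the root.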
Thanks, Harold > -----Original Message----- > From: David Stanek [mailto:dstanek at dstanek.com] > Sent: Tuesday, April 01, 2008 6:39 AM > To: Doran, Harold > Cc: xml-sig at python.org > Subject: Re: [XML-SIG] Learning to use elementtree > > 2008/3/31 Doran, Harold : > > > > But, now I am trying to apply the code I have written to a real xml > > file I need to work with and things are hitting a road block. Is > > anyone on this able willing to look at an xml file I can > send them and > > work with me through a small example to see if I can get > this to work? > > > > What is the road block? > > > -- > David > http://www.traceback.org > -------------- next part -------------- A non-text attachment was scrubbed... Name: out_g3r_b2.xml Type: text/xml Size: 64655 bytes Desc: out_g3r_b2.xml Url : http://mail.python.org/pipermail/xml-sig/attachments/20080401/928e2491/attachment-0001.bin From jcd at unc.edu Tue Apr 1 14:58:18 2008 From: jcd at unc.edu (J. Cliff Dyer) Date: Tue, 01 Apr 2008 08:58:18 -0400 Subject: [XML-SIG] Learning to use elementtree In-Reply-To: <2323A6D37908A847A7C32F1E3662C80E017BDCC1@dc1ex01.air.org> References: <2323A6D37908A847A7C32F1E3662C80E017BDCC1@dc1ex01.air.org> Message-ID: <1207054698.3321.9.camel@aalcdl07.lib.unc.edu> On Tue, 2008-04-01 at 08:35 -0400, Doran, Harold wrote: > David et al: > > Attached is a sample xml file. Below is my python code. I am using > python 2.5.2 on a Windows XP machine. 
> > Test.py > from xml.etree.ElementTree import ElementTree as ET > > # create a new file defined by the user > f = open('output.txt', 'w') > > et = ET(file='g:\python\ml\out_g3r_b2.xml') > > for statentityref in > et.findall('admin/responseanalyses/analysis/analysisdata/statentityref') > : > for statval in > et.findall('admin/responseanalyses/analysis/analysisdata/statentityref/s > tatval'): > print >> f, statentityref.attrib['id'], '\t', > statval.attrib['type'], '\t', statval.attrib['value'] > > f.close() > > If you run this you will see the output organized almost exactly as I > need it. But, there is a bug in my program, which I suspect is in the > order in which I am looping. For example, here is a snippet of output > from the file output.txt. I've added in some comments so you can see > where I am struggling. > > 9568 OmitCount 0.000000 # This is correct > 9568 NotReachedCount 0.000000 # This is correct > 9568 PolyserialCorrelation 0.602525 # This is correct > 9568 AdjustedPolyserial 0.553564 # This is correct > 9568 AverageScore 0.817348 # This is correct > 9568 StdevItemScore 0.386381 # This is correct > 9568 OmitCount 0.000000 # This is NOT correct > 9568 NotReachedCount 0.000000 # This is NOT correct > 9568 PolyserialCorrelation 0.672088 # This is NOT correct > 9568 AdjustedPolyserial 0.590175 # This is NOT correct > 9568 AverageScore 1.034195 # This is NOT correct > 9568 StdevItemScore 0.926668 # This is NOT correct > > Now, here is what *should* be returned. Note that I have manually > changed the item id (the number preceding the text) to 9569. The data > are pulled in correctly, but for some reason I am not looping properly > to get the correct item ID to line up with its corresponding data. 
> > 9568 OmitCount 0.000000 > 9568 NotReachedCount 0.000000 > 9568 PolyserialCorrelation 0.602525 > 9568 AdjustedPolyserial 0.553564 > 9568 AverageScore 0.817348 > 9568 StdevItemScore 0.386381 > 9569 OmitCount 0.000000 # Note the item ID has been modified > here and below. > 9569 NotReachedCount 0.000000 > 9569 PolyserialCorrelation 0.672088 > 9569 AdjustedPolyserial 0.590175 > 9569 AverageScore 1.034195 > 9569 StdevItemScore 0.926668 > > Last, notice the portion of code > > admin/responseanalyses/analysis/analysisdata/statentityref') > > I know this is what to use only because I manually went through the xml > file to examine its hierarchical structure. I assume this is bad > pratice. Is there a way to examine the parent-child structure of an XML > file in python so I can see the hierarchical structure? > > Thanks, > Harold If you keep looking in your probably massive output file, you'll also find the same results under 9569, 9567, 9571, and all your other statentityrefs. In the following code: for statentityref in \ et.findall('admin/responseanalyses/analysis/analysisdata/statentityref'): for statval in \ et.findall('admin/responseanalyses/analysis/analysisdata/statentityref/statval'): print >> f, statentityref.attrib['id'], '\t', statval.attrib['type'], \ '\t', statval.attrib['value'] there is nothing limiting statval to within statentityref, so for each statentityref, you get all the statvals from *every* statentityref. Try something like this: for statentityref in \ et.findall('admin/responseanalyses/analysis/analysisdata/statentityref'): for statval in statentityref.findall('statval'): do(stuff) Note that now the xpath from which you get statval is limited to searching within the current statentityref, and takes that statentityref as its context node. Or, if you want to shorten up your code lines a bit, break out part of your xpath. 

    # find() returns the single analysisdata element, which has its own
    # findall(); findall() here would return a list, which does not.
    analysisdata = et.find('admin/responseanalyses/analysis/analysisdata')
    for statentityref in analysisdata.findall('statentityref'):
        for statval in statentityref.findall('statval'):
            do(stuff)

Cheers,
Cliff

From HDoran at air.org Wed Apr 2 20:39:50 2008
From: HDoran at air.org (Doran, Harold)
Date: Wed, 2 Apr 2008 14:39:50 -0400
Subject: [XML-SIG] Learning to use elementtree
In-Reply-To: <1207054698.3321.9.camel@aalcdl07.lib.unc.edu>
Message-ID: <2323A6D37908A847A7C32F1E3662C80E017BDD5F@dc1ex01.air.org>

Cliff

This was very helpful, thank you. I have modified the code accordingly and
all is working as expected. I want to make one modification, but I am
having some problems generalizing the code.

The current program operates as follows:

xmlReader.py

    from xml.etree.ElementTree import ElementTree as ET

    filename = raw_input("Please enter the AM XML file: ")
    new_file = raw_input("Save this file as: ")

    # create a new file defined by the user
    f = open(new_file, 'w')

    et = ET(file=filename)

    for statentityref in \
            et.findall('admin/responseanalyses/analysis/analysisdata/statentityref'):
        for statval in statentityref.findall('statval'):
            print >> f, statentityref.attrib['id'], '\t', statval.attrib['type'], '\t', statval.attrib['value']

    f.close()

This is based on your recommendation and works smoothly. Now, in the XML
file (which I have again attached), there are other statistics nested
inside admin/responseanalyses/analysis/analysisdata/statentityref that I
want in addition to what is already being extracted.

For example (see snippet of XML below), the current program above pulls out
the attributes for id = 13963, skips the information below it where id = 0
or id = 1, and then pulls out the information for id = 13962. My goal is to
extract the information where id = 0 or 1 in addition to the attributes for
id = 13963.

- - - - - -

The current output from xmlReader.py using the attached XML file looks like
this (for these two IDs):

    13963   OmitCount               0.000000
    13963   NotReachedCount         0.000000
    13963   PolyserialCorrelation   0.496309
    13963   AdjustedPolyserial      0.452588
    13963   AverageScore            0.981667
    13963   StdevItemScore          0.134154
    13962   OmitCount               0.000000
    13962   NotReachedCount         0.000000
    13962   PolyserialCorrelation   0.484469
    13962   AdjustedPolyserial      0.425165
    13962   AverageScore            0.743333
    13962   StdevItemScore          0.436794

What I would like, in addition to what is already extracted, would be
something like:

    # This is already provided
    13963   OmitCount               0.000000
    13963   NotReachedCount         0.000000
    13963   PolyserialCorrelation   0.496309
    13963   AdjustedPolyserial      0.452588
    13963   AverageScore            0.981667
    13963   StdevItemScore          0.134154

    # This is the info nested in id=13963 and would be new
    # Note the dash 0 or 1 depending on which attribute provides the info
    13963-0 UncollapsedMeanScore    23.863636   2.039014
    13963-0 ScorePtPct              0.018333    0.003874
    ...
    13963-1 UncollapsedMeanScore    34.941426   0.25634
    13963-1 ScorePtPct              0.981667    0.003874

and so on for all items. My modifications to the code are resulting in no
output being generated, so after quite a few failures I would appreciate
any advice on this.

Thanks.

> -----Original Message-----
> From: J. Cliff Dyer [mailto:jcd at unc.edu]
> Sent: Tuesday, April 01, 2008 8:58 AM
> To: Doran, Harold
> Cc: xml-sig at python.org
> Subject: Re: [XML-SIG] Learning to use elementtree
>
> On Tue, 2008-04-01 at 08:35 -0400, Doran, Harold wrote:
> > David et al:
> >
> > Attached is a sample xml file. Below is my python code. I am using
> > python 2.5.2 on a Windows XP machine.
> > > > Test.py > > from xml.etree.ElementTree import ElementTree as ET > > > > # create a new file defined by the user f = open('output.txt', 'w') > > > > et = ET(file='g:\python\ml\out_g3r_b2.xml') > > > > for statentityref in > > > et.findall('admin/responseanalyses/analysis/analysisdata/statentityref > > ') > > : > > for statval in > > > et.findall('admin/responseanalyses/analysis/analysisdata/statentityref > > /s > > tatval'): > > print >> f, statentityref.attrib['id'], '\t', > > statval.attrib['type'], '\t', statval.attrib['value'] > > > > f.close() > > > > If you run this you will see the output organized almost > exactly as I > > need it. But, there is a bug in my program, which I suspect > is in the > > order in which I am looping. For example, here is a snippet > of output > > from the file output.txt. I've added in some comments so > you can see > > where I am struggling. > > > > 9568 OmitCount 0.000000 # This is correct > > 9568 NotReachedCount 0.000000 # This is correct > > 9568 PolyserialCorrelation 0.602525 # This is correct > > 9568 AdjustedPolyserial 0.553564 # This is correct > > 9568 AverageScore 0.817348 # This is correct > > 9568 StdevItemScore 0.386381 # This is correct > > 9568 OmitCount 0.000000 # This is NOT correct > > 9568 NotReachedCount 0.000000 # This is NOT correct > > 9568 PolyserialCorrelation 0.672088 # This is NOT correct > > 9568 AdjustedPolyserial 0.590175 # This is NOT correct > > 9568 AverageScore 1.034195 # This is NOT correct > > 9568 StdevItemScore 0.926668 # This is NOT correct > > > > Now, here is what *should* be returned. Note that I have manually > > changed the item id (the number preceding the text) to > 9569. The data > > are pulled in correctly, but for some reason I am not > looping properly > > to get the correct item ID to line up with its corresponding data. 
> > > > 9568 OmitCount 0.000000 > > 9568 NotReachedCount 0.000000 > > 9568 PolyserialCorrelation 0.602525 > > 9568 AdjustedPolyserial 0.553564 > > 9568 AverageScore 0.817348 > > 9568 StdevItemScore 0.386381 > > 9569 OmitCount 0.000000 # Note the item ID > has been modified > > here and below. > > 9569 NotReachedCount 0.000000 > > 9569 PolyserialCorrelation 0.672088 > > 9569 AdjustedPolyserial 0.590175 > > 9569 AverageScore 1.034195 > > 9569 StdevItemScore 0.926668 > > > > Last, notice the portion of code > > > > admin/responseanalyses/analysis/analysisdata/statentityref') > > > > I know this is what to use only because I manually went through the > > xml file to examine its hierarchical structure. I assume > this is bad > > pratice. Is there a way to examine the parent-child structure of an > > XML file in python so I can see the hierarchical structure? > > > > Thanks, > > Harold > > If you keep looking in your probably massive output file, > you'll also find the same results under 9569, 9567, 9571, and > all your other statentityrefs. In the following code: > > for statentityref in \ > et.findall('admin/responseanalyses/analysis/analysisdata/state > ntityref'): > for statval in \ > et.findall('admin/responseanalyses/analysis/analysisdata/state > ntityref/statval'): > print >> f, statentityref.attrib['id'], '\t', > statval.attrib['type'], \ > '\t', statval.attrib['value'] > > there is nothing limiting statval to within statentityref, so > for each statentityref, you get all the statvals from *every* > statentityref. Try something like this: > > for statentityref in \ > et.findall('admin/responseanalyses/analysis/analysisdata/state > ntityref'): > for statval in statentityref.findall('statval'): > do(stuff) > > Note that now the xpath from which you get statval is limited > to searching within the current statentityref, and takes that > statentityref as its context node. > > Or, if you want to shorten up your code lines a bit, break > out part of your xpath. 
> > analysisdata = > et.findall('admin/responeanalyses/analysis/analysisdata') > for statentityref in analysisdata.findall('statentityref'): > for statval in statentityref.findall('statval'): > do(stuff) > > > Cheers, > Cliff > > > -------------- next part -------------- A non-text attachment was scrubbed... Name: out_g4r_b.xml Type: text/xml Size: 66713 bytes Desc: out_g4r_b.xml Url : http://mail.python.org/pipermail/xml-sig/attachments/20080402/ffd03348/attachment-0001.bin From jcd at unc.edu Wed Apr 2 20:46:57 2008 From: jcd at unc.edu (J. Cliff Dyer) Date: Wed, 02 Apr 2008 14:46:57 -0400 Subject: [XML-SIG] Learning to use elementtree In-Reply-To: <2323A6D37908A847A7C32F1E3662C80E017BDD5F@dc1ex01.air.org> References: <2323A6D37908A847A7C32F1E3662C80E017BDD5F@dc1ex01.air.org> Message-ID: <1207162017.7981.18.camel@aalcdl07.lib.unc.edu> On Wed, 2008-04-02 at 14:39 -0400, Doran, Harold wrote: > Cliff > > This was very helpful, thank you. I have modified the code accordingly > and all is working as expected. I want to make one modification, but > seem to be having some problems with generalization of the code. > > The current program operates as follows: > > xmlReader.py > from xml.etree.ElementTree import ElementTree as ET > > filename = raw_input("Please enter the AM XML file: ") > new_file = raw_input("Save this file as: ") > > # create a new file defined by the user > f = open(new_file, 'w') > > et = ET(file=filename) > > for statentityref in \ > et.findall('admin/responseanalyses/analysis/analysisdata/statentityref') > : > for statval in statentityref.findall('statval'): > print >> f, statentityref.attrib['id'], '\t', > statval.attrib['type'], '\t', statval.attrib['value'] > > f.close() > > This is based on your recommendation and works smoothly. Now, in the xml > file (which I have again attached), there are other statistics nested > inside admin/responseanalyses/analysis/analysisdata/statentityref that I > want in addition to what is already being extracted. 
>
> For example, (see snippet of xml below) the current program above pulls
> out the attributes for id = 13963, skips the information below it where
> id = 0 or id =1 and then pulls out the information for id = 13962. My
> goal is to extract the information where id = 0 or 1 in addition to the
> attribute for id=13963.
>
>
> -
> />
>
>
>
>

I don't have your code to work from, but based on what you've said, I think
you might be having troubles with the difference between strings and
integers. 0 == 0.000000, but "0" != "0.000000". XML is made up of strings,
so unless you're converting to ints or floats before you try to match,
you're going to miss on that account.

Another common source of errors (at least for me :)) is not being aware of
your current context node. Figure out where you are, and what's available
from that point.

Cheers,
Cliff

From HDoran at air.org Wed Apr 2 21:28:36 2008
From: HDoran at air.org (Doran, Harold)
Date: Wed, 2 Apr 2008 15:28:36 -0400
Subject: [XML-SIG] Learning to use elementtree
In-Reply-To: <1207162017.7981.18.camel@aalcdl07.lib.unc.edu>
Message-ID: <2323A6D37908A847A7C32F1E3662C80E017BDD60@dc1ex01.air.org>

Indeed, navigating the xml is tough (for me). I have been able to get the
following to work. I put in "Sub Element" to indicate the new section of
data. But, from looking at the text output, one doesn't know which item
these sub elements belong to. I think the solution is to create an index
like 13965-0 to show that this is the subinformation from the item above
it. That seems to be where I am getting stuck. Although, I am open to other
suggestions on how to best represent the output.

    from xml.etree.ElementTree import ElementTree as ET

    filename = raw_input("Please enter the AM XML file: ")
    new_file = raw_input("Save this file as: ")

    # create a new file defined by the user
    f = open(new_file, 'w')

    et = ET(file=filename)

    for statentityref in \
            et.findall('admin/responseanalyses/analysis/analysisdata/statentityref'):
        for statval in statentityref.findall('statval'):
            print >> f, statentityref.attrib['id'], '\t', statval.attrib['type'], '\t', statval.attrib['value']

    f.write("\n\n")
    f.write("Sub Element\n\n")

    for statentityref in \
            et.findall('admin/responseanalyses/analysis/analysisdata/statentityref/statentityref'):
        for statval in statentityref.findall('statval'):
            print >> f, statentityref.attrib['id'], '\t', statval.attrib['type'], '\t', statval.attrib['value']
    f.close()

> -----Original Message-----
> From: J. Cliff Dyer [mailto:jcd at unc.edu]
> Sent: Wednesday, April 02, 2008 2:47 PM
> To: Doran, Harold
> Cc: xml-sig at python.org
> Subject: Re: [XML-SIG] Learning to use elementtree
>
> On Wed, 2008-04-02 at 14:39 -0400, Doran, Harold wrote:
> > Cliff
> >
> > This was very helpful, thank you. I have modified the code accordingly
> > and all is working as expected. I want to make one modification, but
> > seem to be having some problems with generalization of the code.
> > > > The current program operates as follows: > > > > xmlReader.py > > from xml.etree.ElementTree import ElementTree as ET > > > > filename = raw_input("Please enter the AM XML file: ") new_file = > > raw_input("Save this file as: ") > > > > # create a new file defined by the user f = open(new_file, 'w') > > > > et = ET(file=filename) > > > > for statentityref in \ > > > et.findall('admin/responseanalyses/analysis/analysisdata/statentityref > > ') > > : > > for statval in statentityref.findall('statval'): > > print >> f, statentityref.attrib['id'], '\t', > > statval.attrib['type'], '\t', statval.attrib['value'] > > > > f.close() > > > > This is based on your recommendation and works smoothly. > Now, in the > > xml file (which I have again attached), there are other statistics > > nested inside > > admin/responseanalyses/analysis/analysisdata/statentityref > that I want in addition to what is already being extracted. > > > > For example, (see snippet of xml below) the current program above > > pulls out the attributes for id = 13963, skips the > information below > > it where id = 0 or id =1 and then pulls out the information > for id = > > 13962. My goal is to extract the information where id = 0 or 1 in > > addition to the attribute for id=13963. > > > > > > - > > se="0.256340" > > /> > > > > > > > > > > I don't have your code to work from, but based on what you've > said, I think you might be having troubles with the > difference between strings and integers. 0 == 0.000000, but > "0" != "0.000000". XML is made up of strings, so unless > you're converting to ints or floats before you try to match, > you're going to miss on that account. > > Another common source of errors (at least for me :)) is not > being aware of your current context node. Figure out where > you are, and what's available from that point. > > Cheers, > Cliff > > From jcd at unc.edu Wed Apr 2 21:36:09 2008 From: jcd at unc.edu (J. 
Cliff Dyer) Date: Wed, 02 Apr 2008 15:36:09 -0400 Subject: [XML-SIG] Learning to use elementtree In-Reply-To: <2323A6D37908A847A7C32F1E3662C80E017BDD60@dc1ex01.air.org> References: <2323A6D37908A847A7C32F1E3662C80E017BDD60@dc1ex01.air.org> Message-ID: <1207164969.8493.3.camel@aalcdl07.lib.unc.edu> On Wed, 2008-04-02 at 15:28 -0400, Doran, Harold wrote: > Indeed, navigating the xml is tough (for me). I have been able to get > the following to work. I put in "Sub Element" to indicate the new > section of data. But, from looking at the text output, one doesn't know > which item these sub elements belong to. I think the solution is to > create an index like 13965-0 to show that this is the subinformation > from the item above it. That seems to be where I am getting stuck. > Although, I am open to other suggestions on how to best represent the > output. > > from xml.etree.ElementTree import ElementTree as ET > > filename = raw_input("Please enter the AM XML file: ") > new_file = raw_input("Save this file as: ") > > # create a new file defined by the user > f = open(new_file, 'w') > > et = ET(file=filename) > > for statentityref in \ > et.findall('admin/responseanalyses/analysis/analysisdata/statentityref') > : > for statval in statentityref.findall('statval'): > print >> f, statentityref.attrib['id'], '\t', > statval.attrib['type'], '\t', statval.attrib['value'] > > f.write("\n\n") > f.write("Sub Element\n\n") > > for statentityref in \ > et.findall('admin/responseanalyses/analysis/analysisdata/statentityref/s > tatentityref'): > for statval in statentityref.findall('statval'): > print >> f, statentityref.attrib['id'], '\t', > statval.attrib['type'], '\t', statval.attrib['value'] > f.close() Do you want your second statentityref loop to be based on its parent statentityref? If so, you need to nest it in the original loop, and use an xpath relative to your outer statentityref (and watch for name collisions). 
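
The nesting described in the reply above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the thread: it targets current Python 3, the `SAMPLE` document and its `<report>` root are hypothetical (element and attribute names are taken from the thread, the nested id strings are illustrative), and the loop names `item` and `option` are chosen here only to avoid reusing `statentityref` for both levels.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of the attached file; the <report> root and the
# exact values are assumptions made for this sketch.
SAMPLE = """<report><admin><responseanalyses><analysis><analysisdata>
  <statentityref id="13963">
    <statval type="OmitCount" value="0.000000"/>
    <statentityref id="0.000000">
      <statval type="ScorePtPct" value="0.018333"/>
    </statentityref>
    <statentityref id="1.000000">
      <statval type="ScorePtPct" value="0.981667"/>
    </statentityref>
  </statentityref>
</analysisdata></analysis></responseanalyses></admin></report>"""

ITEM_PATH = "admin/responseanalyses/analysis/analysisdata/statentityref"


def rows(root):
    """Return (label, type, value) tuples for items and their nested entries."""
    out = []
    for item in root.findall(ITEM_PATH):
        # Statistics that belong directly to the outer element.
        for statval in item.findall("statval"):
            out.append((item.get("id"), statval.get("type"), statval.get("value")))
        # The inner search is relative to the current item, so only its own
        # nested statentityref children are visited.
        for option in item.findall("statentityref"):
            label = item.get("id") + "-" + option.get("id")
            for statval in option.findall("statval"):
                out.append((label, statval.get("type"), statval.get("value")))
    return out


for label, stat_type, value in rows(ET.fromstring(SAMPLE)):
    print(label, stat_type, value, sep="\t")
```

Because the outer and inner loop variables have different names, the parent's id is still available inside the inner loop when the combined label is built.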
From HDoran at air.org Thu Apr 3 00:33:14 2008
From: HDoran at air.org (Doran, Harold)
Date: Wed, 2 Apr 2008 18:33:14 -0400
Subject: [XML-SIG] Learning to use elementtree
In-Reply-To: <1207164969.8493.3.camel@aalcdl07.lib.unc.edu>
Message-ID: <2323A6D37908A847A7C32F1E3662C80E017BDD6A@dc1ex01.air.org>

Well, I think I'm getting close. But, I think this is similar to the
problem I had when I started. This seems to create a huge data file with
all information under the first item, and then again all information under
the second item and so forth.

    for statentityref in \
            et.findall('admin/responseanalyses/analysis/analysisdata/statentityref'):
        print >> f, statentityref.attrib['id']
        for statentityref in \
                et.findall('admin/responseanalyses/analysis/analysisdata/statentityref/statentityref'):
            for statval in statentityref.findall('statval'):
                print >> f, statentityref.attrib['id'], '\t', statval.attrib['type'], '\t', statval.attrib['value']

I think again I may not be limiting the intended xpath.

> -----Original Message-----
> From: J. Cliff Dyer [mailto:jcd at unc.edu]
> Sent: Wednesday, April 02, 2008 3:36 PM
> To: Doran, Harold
> Cc: xml-sig at python.org
> Subject: Re: [XML-SIG] Learning to use elementtree
>
> On Wed, 2008-04-02 at 15:28 -0400, Doran, Harold wrote:
> > Indeed, navigating the xml is tough (for me). I have been able to get
> > the following to work. I put in "Sub Element" to indicate the new
> > section of data. But, from looking at the text output, one doesn't
> > know which item these sub elements belong to. I think the solution is
> > to create an index like 13965-0 to show that this is the
> > subinformation from the item above it. That seems to be where I am
> > getting stuck.
> > Although, I am open to other suggestions on how to best represent the
> > output.
> > > > from xml.etree.ElementTree import ElementTree as ET > > > > filename = raw_input("Please enter the AM XML file: ") new_file = > > raw_input("Save this file as: ") > > > > # create a new file defined by the user f = open(new_file, 'w') > > > > et = ET(file=filename) > > > > for statentityref in \ > > > et.findall('admin/responseanalyses/analysis/analysisdata/statentityref > > ') > > : > > for statval in statentityref.findall('statval'): > > print >> f, statentityref.attrib['id'], '\t', > > statval.attrib['type'], '\t', statval.attrib['value'] > > > > f.write("\n\n") > > f.write("Sub Element\n\n") > > > > for statentityref in \ > > > et.findall('admin/responseanalyses/analysis/analysisdata/statentityref > > /s > > tatentityref'): > > for statval in statentityref.findall('statval'): > > print >> f, statentityref.attrib['id'], '\t', > > statval.attrib['type'], '\t', statval.attrib['value'] > > f.close() > > Do you want your second statentityref loop to be based on its > parent statentityref? If so, you need to nest it in the > original loop, and use an xpath relative to your outer > statentityref (and watch for name collisions). > > > > From martin at v.loewis.de Sat Apr 5 22:51:30 2008 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 05 Apr 2008 22:51:30 +0200 Subject: [XML-SIG] Content is split into two In-Reply-To: References: <1206538761.3328.3.camel@aalcdl07.lib.unc.edu> Message-ID: <47F7E652.8090504@v.loewis.de> > Wow, totally unexpected. Wonder why it's designed as it is? This is > especially weird to me since the string size isn't big (small buffer) > and this add a bit of complexity to the text processing. There are two reasons: 1. Efficiency. The parser reads a block of input into a buffer, and then parses out of this buffer. 
If the buffer is exhausted, it first passes the data to the application,
rather than having to grow the buffer if the text content is not complete
(which would involve copying the data, potentially several times).

2. Correctness. If you have an entity reference (such as &copy; in HTML) in
your input, the parser needs to tell the application what the source entity
is (i.e. what system and public identifier it has). If it returned all data
in a single buffer, the source data would be distributed across different
entities, making it impossible to refer to the source with a single URL.

HTH,
Martin

From stefan_ml at behnel.de Mon Apr 7 14:31:55 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 07 Apr 2008 14:31:55 +0200
Subject: [XML-SIG] Learning to use elementtree
In-Reply-To: <2323A6D37908A847A7C32F1E3662C80E017BDD6A@dc1ex01.air.org>
References: <2323A6D37908A847A7C32F1E3662C80E017BDD6A@dc1ex01.air.org>
Message-ID: <47FA143B.7010501@behnel.de>

Hi,

Doran, Harold wrote:
> Well, I think I'm getting close. But, I think this is similar to the
> problem I had when I started. This seems to create a huge data file with
> all information under the first item, and then again all information
> under the second item and so forth.
>
> for statentityref in \
>         et.findall('admin/responseanalyses/analysis/analysisdata/statentityref'):
>     print >> f, statentityref.attrib['id']
>     for statentityref in \
>             et.findall('admin/responseanalyses/analysis/analysisdata/statentityref/statentityref'):
>         for statval in statentityref.findall('statval'):
>             print >> f, statentityref.attrib['id'], '\t', statval.attrib['type'], '\t', statval.attrib['value']

I think you should read the previous post again. You are nesting three
loops here where two would do what you want.

Stefan

>> -----Original Message-----
>> From: J. 
Cliff Dyer [mailto:jcd at unc.edu] >> Sent: Wednesday, April 02, 2008 3:36 PM >> To: Doran, Harold >> Cc: xml-sig at python.org >> Subject: Re: [XML-SIG] Learning to use elementtree >> >> On Wed, 2008-04-02 at 15:28 -0400, Doran, Harold wrote: >>> Indeed, navigating the xml is tough (for me). I have been >> able to get >>> the following to work. I put in "Sub Element" to indicate the new >>> section of data. But, from looking at the text output, one doesn't >>> know which item these sub elements belong to. I think the >> solution is >>> to create an index like 13965-0 to show that this is the >>> subinformation from the item above it. That seems to be >> where I am getting stuck. >>> Although, I am open to other suggestions on how to best >> represent the >>> output. >>> >>> from xml.etree.ElementTree import ElementTree as ET >>> >>> filename = raw_input("Please enter the AM XML file: ") new_file = >>> raw_input("Save this file as: ") >>> >>> # create a new file defined by the user f = open(new_file, 'w') >>> >>> et = ET(file=filename) >>> >>> for statentityref in \ >>> >> et.findall('admin/responseanalyses/analysis/analysisdata/statentityref >>> ') >>> : >>> for statval in statentityref.findall('statval'): >>> print >> f, statentityref.attrib['id'], '\t', >>> statval.attrib['type'], '\t', statval.attrib['value'] >>> >>> f.write("\n\n") >>> f.write("Sub Element\n\n") >>> >>> for statentityref in \ >>> >> et.findall('admin/responseanalyses/analysis/analysisdata/statentityref >>> /s >>> tatentityref'): >>> for statval in statentityref.findall('statval'): >>> print >> f, statentityref.attrib['id'], '\t', >>> statval.attrib['type'], '\t', statval.attrib['value'] >>> f.close() >> Do you want your second statentityref loop to be based on its >> parent statentityref? If so, you need to nest it in the >> original loop, and use an xpath relative to your outer >> statentityref (and watch for name collisions). 
From HDoran at air.org Tue Apr 8 15:48:05 2008
From: HDoran at air.org (Doran, Harold)
Date: Tue, 8 Apr 2008 09:48:05 -0400
Subject: [XML-SIG] Learning to use elementtree
In-Reply-To: <47FA143B.7010501@behnel.de>
Message-ID: <2323A6D37908A847A7C32F1E3662C80E017BDDD7@dc1ex01.air.org>

Thanks. I'm piecing this together slowly, but I did get the following to
work.

Test.py

from xml.etree.ElementTree import ElementTree as ET

f = open('test.txt', 'w')
et = ET(file='out_g4r_b.xml')
for statentityref in et.findall('admin/responseanalyses/analysis/analysisdata/statentityref'):
    print >> f, statentityref.attrib['id']
    for statentityref in statentityref.findall('statentityref'):
        for statval in statentityref.findall('statval'):
            print >> f, statentityref.attrib['id'], '\t', statval.attrib['type'], '\t', statval.attrib['value']
f.close()

And this gives output like:

13963
0.000000 UncollapsedMeanScore 23.863636
0.000000 ScorePtPct 0.018333
0.000000 ScorePtBiserial -0.496309
0.000000 ScorePtAdjBiserial -0.452588
1.000000 UncollapsedMeanScore 34.941426
1.000000 ScorePtPct 0.981667
1.000000 ScorePtBiserial 0.496309
1.000000 ScorePtAdjBiserial 0.452588
omit ScorePtPct 0.000000
omit ScorePtBiserial -99999.990000
omit ScorePtAdjBiserial -99999.990000
13962
0.000000 UncollapsedMeanScore 29.305195
0.000000 ScorePtPct 0.256667
0.000000 ScorePtBiserial -0.484469
0.000000 ScorePtAdjBiserial -0.425165
1.000000 UncollapsedMeanScore 36.614350
1.000000 ScorePtPct 0.743333
1.000000 ScorePtBiserial 0.484469
1.000000 ScorePtAdjBiserial 0.425165
omit ScorePtPct 0.000000
omit ScorePtBiserial -99999.990000
omit ScorePtAdjBiserial -99999.990000
...

This is almost exactly what I want, and I can live with this if needed. What
would be most convenient, however, is to format the output as follows:

13963 0.000000 UncollapsedMeanScore 23.863636
13963 0.000000 ScorePtPct 0.018333
13963 0.000000 ScorePtBiserial -0.496309
13963 0.000000 ScorePtAdjBiserial -0.452588
13963 1.000000 UncollapsedMeanScore 34.941426
13963 1.000000 ScorePtPct 0.981667
13963 1.000000 ScorePtBiserial 0.496309
13963 1.000000 ScorePtAdjBiserial 0.452588

I think this may be what Cliff meant by name collision. That is, the number
13963 comes from the attribute ['id'] in statentityref. But 0.000 and 1.0
also come from the id attribute, in the statentityref nested inside
statentityref. So, I'm a bit confused as to how to go about printing them
out side by side.

From HDoran at air.org Tue Apr 8 16:14:17 2008
From: HDoran at air.org (Doran, Harold)
Date: Tue, 8 Apr 2008 10:14:17 -0400
Subject: [XML-SIG] Learning to use elementtree
In-Reply-To: <2323A6D37908A847A7C32F1E3662C80E017BDDD7@dc1ex01.air.org>
Message-ID: <2323A6D37908A847A7C32F1E3662C80E017BDDE9@dc1ex01.air.org>

Well, maybe this is what I should have done to start with to avoid the name
collision problem:

from xml.etree.ElementTree import ElementTree as ET

f = open('test.txt', 'w')
et = ET(file='out_g4r_b.xml')
for statentityref in et.findall('admin/responseanalyses/analysis/analysisdata/statentityref'):
    for ss in statentityref.findall('statentityref'):
        for statval in ss.findall('statval'):
            print >> f, statentityref.attrib['id'], ss.attrib['id'], '\t', statval.attrib['type'], '\t', statval.attrib['value']
f.close()

This works and formats output as desired. Just checking to see if this is
the way others would tackle this.

From stefan_ml at behnel.de Tue Apr 8 16:14:17 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Tue, 8 Apr 2008 16:14:17 +0200 (CEST)
Subject: [XML-SIG] Learning to use elementtree
In-Reply-To: <2323A6D37908A847A7C32F1E3662C80E017BDDD7@dc1ex01.air.org>
References: <2323A6D37908A847A7C32F1E3662C80E017BDDD7@dc1ex01.air.org>
Message-ID: <47534.194.114.62.67.1207664057.squirrel@groupware.dvs.informatik.tu-darmstadt.de>

Doran, Harold wrote:
> for statentityref in statentityref.findall('statentityref'):

In this line, you assign new values to the name "statentityref" in each
iteration of the loop.

> for statval in statentityref.findall('statval'):
>     print >> f, statentityref.attrib['id'], '\t',

Here, you access the attribute "id" of the element you referenced by that
name, which is different in each iteration. But only the first element seems
to have a meaningful value for this attribute.

You should

a) read the Python tutorial, especially the chapter on loops, and

b) assign the value of the "id" attribute to a variable outside the loop and
   print it inside the loop

Stefan

From devsfan1830 at gmail.com Sat Apr 12 16:52:08 2008
From: devsfan1830 at gmail.com (James Duffy)
Date: Sat, 12 Apr 2008 10:52:08 -0400
Subject: [XML-SIG] Hoping someone can help with this small problem
Message-ID: <000101c89cac$c8d09540$5a71bfc0$@com>

I'm currently working on a project and have hit a small jam. I need to be
able to produce, send, receive, and parse a stream of XML files. I've got
the production and parser set up; however, I need a python script that will
open a connection to another machine and maintain that connection for the
duration of the session, during which XML files are sent from one computer
to another via Ethernet. For our basic needed setup we are going to use a
crossover cable to connect two machines.
I'm hoping someone has written or seen something that can help me accomplish
this. Time is short so I would appreciate any quick and helpful responses.

From stefan_ml at behnel.de Sat Apr 12 18:33:08 2008
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 12 Apr 2008 18:33:08 +0200
Subject: [XML-SIG] Hoping someone can help with this small problem
In-Reply-To: <000101c89cac$c8d09540$5a71bfc0$@com>
References: <000101c89cac$c8d09540$5a71bfc0$@com>
Message-ID: <4800E444.5000703@behnel.de>

James Duffy wrote:
> I'm currently working on a project and have hit a small jam. I need to be
> able to produce, send, receive, and parse a stream of XML files. I've got
> the production and parser setup, however I need a python script that will
> open a connection to another machine and maintain that connection for the
> duration of the session, during which XML files are sent from one computer
> to another via Ethernet.

Via Ethernet? As plain Ethernet frames? I bet you have a bit more of a
protocol stack available on top of that, such as IP/TCP/HTTP or something.

> For our basic needed setup we are going to use a
> crossover cable to connect two machines. I'm hoping someone has written or
> seen something that can help me accomplish this. Time is short so I would
> appreciate any quick and helpful responses.

Any help here requires a bit more detail from your side, such as the type of
machines (PCs? embedded devices?), the operating system, the available
software, the supported network protocols, the kind of "XML files" you want
to send, how they are generated (based on what kind of data), ...

Maybe XML-RPC is of interest here?

http://tldp.org/HOWTO/XML-RPC-HOWTO/xmlrpc-howto-python.html
http://docs.python.org/lib/module-xmlrpclib.html
http://docs.python.org/lib/module-SimpleXMLRPCServer.html

Stefan

From devsfan1830 at gmail.com Sat Apr 12 18:47:10 2008
From: devsfan1830 at gmail.com (James Duffy)
Date: Sat, 12 Apr 2008 12:47:10 -0400
Subject: [XML-SIG] Hoping someone can help with this small problem
In-Reply-To: <4800E444.5000703@behnel.de>
References: <000101c89cac$c8d09540$5a71bfc0$@com> <4800E444.5000703@behnel.de>
Message-ID: <000001c89cbc$da9c4c00$8fd4e400$@com>

Actually I found some code online for file transfer using the socket class.
So I'm going to fiddle with that to work with my existing project code. My
major hurdle now is getting two computers to connect with a crossover cable.
And sorry, I forgot to mention I'm using Ubuntu 7.10 on a laptop and a
desktop (hoping to get a laptop to laptop setup working instead). I
appreciate the fast response.

From tomkit at gmail.com Thu Apr 24 08:53:21 2008
From: tomkit at gmail.com (Tom Chen)
Date: Wed, 23 Apr 2008 23:53:21 -0700
Subject: [XML-SIG] Installation Question
Message-ID: <6122ad350804232353s3cc10664i314e67fdf530f9f4@mail.gmail.com>

Hi,

I am doing the following in cygwin to install PyXML (I obtained the tar from
sourceforge):

$ python setup.py build
running build
running build_py
running build_ext
running build_scripts

$ python setup.py install
running install
running build
running build_py
running build_ext
running build_scripts
running install_lib
running install_scripts
changing mode of /usr/bin/xmlproc_parse to 755
changing mode of /usr/bin/xmlproc_val to 755
running install_data
running install_egg_info
Removing /usr/lib/python2.5/site-packages/PyXML-0.8.4-py2.5.egg-info
Writing /usr/lib/python2.5/site-packages/PyXML-0.8.4-py2.5.egg-info

Yet when I try to run a Makefile for something that requires PyXML I get the
following error:

$ make
.
.
.
--- Building binary
src/cx_Freeze-3.0.3/FreezePython --target-dir dist --include-modules \
encodings.string_escape,\
  File "C:\Python25\Lib\modulefinder.py", line 191, in load_tail
    raise ImportError, "No module named " + mname
ImportError: No module named xml.sax.drivers2
make: *** [dist] Error 1

I am wondering if there is some kind of path I need to set?
Installation seemed to be successful (the above logs are from subsequent
runs of the installation/build scripts; I've run them before and am running
them again here to reproduce their output).

Thanks,
--Thomas

From martin at v.loewis.de Thu Apr 24 16:14:18 2008
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 24 Apr 2008 16:14:18 +0200
Subject: [XML-SIG] Installation Question
In-Reply-To: <6122ad350804232353s3cc10664i314e67fdf530f9f4@mail.gmail.com>
References: <6122ad350804232353s3cc10664i314e67fdf530f9f4@mail.gmail.com>
Message-ID: <481095BA.6070408@v.loewis.de>

> I am wondering if there is some kind of path I need to set?
> Installation seemed to be successful (the above logs are from
> subsequent runs of the installation/build scripts; I've run them
> before and am running them again here to reproduce their output).

The issue probably is that the module is really called
_xmlplus.sax.drivers2. PyXML plays tricks with the import machinery to
install itself as the xml package, even though Python already ships with
one. It's quite possible that cx_Freeze cannot deal with that setup.

Regards,
Martin
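
One way to see which package a dotted name really resolves to is to import it
and look at the module's own name and file path. The snippet below is a
minimal sketch of that check, not something posted in this thread:
describe_module is a made-up helper, and it relies on importlib.import_module,
which is available in Python 2.7 and 3.x but not in the Python 2.5 used above.
On an interpreter without PyXML the last two names are simply reported as
missing; what they report on a PyXML install is not verified here.

```python
import importlib


def describe_module(name):
    """Return (real module name, file path), or None if the import fails."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # __name__ is the name the module was defined under, which can differ
    # from the name used to import it when a package replaces itself.
    return module.__name__, getattr(module, '__file__', None)


for name in ('xml.sax', 'xml.sax.drivers2', '_xmlplus.sax.drivers2'):
    print(name, '->', describe_module(name))
```

If the name printed for a module differs from the name that was requested,
a tool that locates modules by walking directories (rather than by importing
them) may look in the wrong place, which would fit Martin's guess about
cx_Freeze.
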