[XML-SIG] python, xml, html tags

Necati DEMiR ndemir at demir.web.tr
Tue Mar 29 17:04:58 CEST 2005


Uche Ogbuji wrote:

>On Mon, 2005-03-28 at 21:01 +0300, Necati DEMiR wrote:
>  
>
>>Hi,
>>I'm stuck on something with Python and XML.
>>
>>I have the following file:
>>
>><?xml version="1.0" encoding="UTF-8" standalone="yes"?>
>> <test>
>>  <content> Hello </content>
>>  <content> <b> Hello </b> </content>
>> </test>
>>
>>OK, it is simple :)
>>
>>And I have the following Python code:
>>
>>#!/usr/bin/python
>>from xml.dom import minidom
>>
>>file = open("test.xml","r")
>>xml = minidom.parse(file)
>>print xml.childNodes[0].getElementsByTagName("content")[0].firstChild.data
>>print xml.childNodes[0].getElementsByTagName("content")[1].firstChild.data
>>
>>Again, a simple one :)
>>
>>But when I run this code, I get the following output:
>>Hello
>>
>>How can I access the second one?
>>    
>>
>
>DOM is not very good for this sort of thing.  You could do:
>
>print xml.getElementsByTagName("content")[0].firstChild.data
>print xml.getElementsByTagName("content")[1].getElementsByTagName("b")[0].firstChild.data
>
>But that's silly :-)
>
>More useful thoughts below...
>
>  
>
>>Yes, I know it contains HTML tags, so it
>>doesn't give me the result.
>>    
>>
>
>Your b element happens to have the same name as one used in HTML, but
>that doesn't really make it an HTML tag.  In this case, it's clearly an
>XML tag.
>
>  
>
>>I want to get the whole content as data.
>>How can I do this?
>>    
>>
>
>Use something like the string_value function, listing 5 of the following
>article:
>
>http://www.xml.com/pub/a/2003/01/08/py-xml.html
>
>Or use something with XPath support, which makes this easy.  Using Amara
>( http://www.xml.com/pub/a/2005/01/19/amara.html ), your code would be
>
>from amara import binderytools
>doc = binderytools.bind_file("test.xml")
>print doc.xml_xpath(u'string(//content[1])')
>print doc.xml_xpath(u'string(//content[2])')
>
>Which prints
>
> Hello
>  Hello
>  
>
Thanks, but I want the output to be the following:

Hello
<b> Hello </b>
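
For illustration, here is a minimal sketch of one way minidom could produce that output: serialize each child of the content element with toxml() and join the pieces, so nested markup is kept rather than dropped. The helper name inner_xml and the inline SAMPLE string are assumptions made for this example (the thread itself reads test.xml from disk), and the code is written for current Python, where print is a function.

```python
from xml.dom import minidom

# Inline copy of the test document from this thread (assumption: the real
# script would read test.xml from disk instead).
SAMPLE = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<test>
 <content> Hello </content>
 <content> <b> Hello </b> </content>
</test>"""


def inner_xml(node):
    """Hypothetical helper: return the markup between node's start and end tags."""
    # toxml() serializes text nodes and element nodes alike, so a nested
    # <b> element comes back as markup instead of being skipped.
    return "".join(child.toxml() for child in node.childNodes)


doc = minidom.parseString(SAMPLE)

# Strip the surrounding whitespace that the sample puts inside each element.
results = [inner_xml(c).strip() for c in doc.getElementsByTagName("content")]

for line in results:
    print(line)
```

This differs from the XPath string() approach quoted above, which keeps only the text and discards the tags.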

-- 
---------------------------------------------------
Necati DEMiR
http://demir.web.tr

ndemir at demir.web.tr
0xAF7F745F
8525 9E78 F4EB 1DC6 51F3  1913 BB0D 7D9A AF7F 745F
---------------------------------------------------


