delete from pattern to pattern if it contains match

harirammanohar at gmail.com harirammanohar at gmail.com
Mon Apr 25 05:49:02 EDT 2016


On Monday, April 25, 2016 at 12:47:14 PM UTC+5:30, Jussi Piitulainen wrote:
> harirammanohar at gmail.com writes:
> 
> > Hi Jussi,
> >
> > I have seen that you wrote a definition to fulfill the requirement.
> > Can we do the same thing using an XML parser? I have failed to
> > implement it with Python's XML parser when the file has
> > the content below...
> >
> > <!DOCTYPE web-app 
> >     PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN" 
> >     "http://java.sun.com/dtd/web-app_2_3.dtd">
> >
> > <web-app>
> >
> > and entire thing works if it has as below:
> > <!DOCTYPE web-app 
> > <web-app>
> >
> > What I observe is that XML tree parsing does not work if http tags are
> > there in between web-app...
> 
> Do you get an error message?
> 
> My guess is that the parser needs the DTD but cannot access it. There
> appears to be a DTD at that address, http://java.sun.com/... (it
> redirects to Oracle, who bought Sun a while ago), but something might
> prevent the parser from accessing it by default. If so, the details
> depend on what parser you are trying to use. It may be possible to save
> that DTD as a local file and point the parser to that.
> 
> Your problem is morphing rather wildly. A previous version had namespace
> declarations but no DTD or XSD if I remember right. The initial version
> wasn't XML at all.
> 
> If you post (1) an actual, minimal document, (2) the actual Python
> commands that fail to parse it, and (3) the error message you get,
> someone will be able to help you. The content of the document need not
> be more than "hello, world" level. The DOCTYPE declaration and the
> outermost tags with all their attributes and namespace declarations, if
> any, are important.

Hi Jussi,

Here is the input file, sample.xml:

<?xml version="1.0" encoding="ISO-8859-1"?>
<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee
                      http://xmlns.jcp.org/xml/ns/javaee/web-app_3_1.xsd"
  version="3.1">
    <servlet>
      <servlet-name>controller</servlet-name>
      <servlet-class>com.mycompany.mypackage.ControllerServlet</servlet-class>
      <init-param>
        <param-name>listOrders</param-name>
        <param-value>com.mycompany.myactions.ListOrdersAction</param-value>
      </init-param>
      <init-param>
        <param-name>saveCustomer</param-name>
        <param-value>com.mycompany.myactions.SaveCustomerAction</param-value>
      </init-param>
      <load-on-startup>5</load-on-startup>
    </servlet>


    <servlet-mapping>
      <servlet-name>graph</servlet-name>
      <url-pattern>/graph</url-pattern>
    </servlet-mapping>


    <session-config>
      <session-timeout>30</session-timeout>
    </session-config>
</web-app>

--------------------------------
Here is the code:

import xml.etree.ElementTree as ET
ET.register_namespace("", "http://xmlns.jcp.org/xml/ns/javaee")
tree = ET.parse('sample.xml')
root = tree.getroot()

for servlet in root.findall('servlet'):
    servletname = servlet.find('servlet-name').text
    if servletname == "controller":
        root.remove(servlet)

tree.write('output.xml')

This works only if the <web-app> start tag does not have the attributes below:

xmlns="http://xmlns.jcp.org/xml/ns/javaee"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee
                      http://xmlns.jcp.org/xml/ns/javaee/web-app_3_1.xsd"
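
A likely explanation, offered as a sketch rather than a confirmed diagnosis: with the default xmlns declaration present, ElementTree stores every element name qualified as {http://xmlns.jcp.org/xml/ns/javaee}servlet, so root.findall('servlet') matches nothing and the loop body never runs. The version below qualifies the tag names explicitly. The helper name remove_servlet, the constant NS, and the trimmed inline SAMPLE are illustrative assumptions, not part of the original code.

```python
import xml.etree.ElementTree as ET

# Default namespace declared on <web-app> in sample.xml
NS = "http://xmlns.jcp.org/xml/ns/javaee"

# Trimmed stand-in for sample.xml (hypothetical, for illustration only)
SAMPLE = """<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee" version="3.1">
    <servlet>
      <servlet-name>controller</servlet-name>
    </servlet>
    <servlet-mapping>
      <servlet-name>graph</servlet-name>
    </servlet-mapping>
</web-app>"""


def remove_servlet(root, name, ns=NS):
    """Remove direct <servlet> children whose <servlet-name> equals name.

    Returns the number of elements removed.
    """
    removed = 0
    # Tag names must carry the namespace in {uri}local form.
    for servlet in list(root.findall(f"{{{ns}}}servlet")):
        servlet_name = servlet.find(f"{{{ns}}}servlet-name")
        if servlet_name is not None and servlet_name.text == name:
            root.remove(servlet)
            removed += 1
    return removed


# Keep the output free of generated ns0: prefixes.
ET.register_namespace("", NS)
root = ET.fromstring(SAMPLE)
removed = remove_servlet(root, "controller")
result = ET.tostring(root, encoding="unicode")
```

With the real file, the same helper would be called on ET.parse('sample.xml').getroot() before tree.write('output.xml'). Passing a prefix map to findall, for example root.findall('j:servlet', {'j': NS}), is an equivalent way to write the lookup.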


