XML & Python

Bas van Gils bas.vangils at home.nl
Wed Jan 9 09:20:07 EST 2002


On Wed, Jan 09, 2002 at 11:11:03AM -0000, Hugo Martires wrote:
> Does anyone know anything about integrating XML with Python?
> How can I work with both and take advantage of it?

For my graduation thesis I used both. From that experience I can tell
you that I'm quite happy with it:-) I used the basic documentation
online:

    http://www.python.org/doc/current/lib/module-xmllib.html

I subclassed xmllib.XMLParser to complete my task. It selects:

    - all data between xml-tags
    - all the values for name-attributes of xml-tags.

The class that I wrote was this:

    import re
    import xmllib
    class Parser(xmllib.XMLParser):
        """specialised parser for XML-files

        the load(file) method reads a file one line at a time and
        feeds it to the inherited XMLParser.

        the handle_data() method replaces punctuation and whitespace
        with spaces before storing the text found in the XML-spec;
        start_class() stores the values of name-attributes

        the getData() method returns the stored words (list)"""

        def __init__(self):
            """__init__()
            constructor initialises the parser
            and self.__data"""

            xmllib.XMLParser.__init__(self)
            self.__data = []

        def load(self,file):
            """load(file)
            load the file, read it one line at a time
            feed each line to the parser"""

            while 1:
                s = file.readline()
                if not s:
                    # no more lines are available
                    break
                # feed the line to the parser
                self.feed(s)
            self.close()

        def start_class(self,attrs):
            """handler for class start-tags
            stores the name-attribute in this start-tag"""
            try:
                self.__data.append(attrs["name"])
            except KeyError:
                print("Error in method start_class(): \n\t no attribute 'name' found")

        def handle_data(self,data):
            """handler for data (stuff between xml-tags)
            only stores data of non-zero length
            also replaces every non-word character (whitespace and
            punctuation) with a space-character"""

            if len(data) != 0:
                # replace whitespace and punctuation with spaces
                # ("\W" matches any non-word character)
                data = re.sub(r"\W", " ", data)
                self.__data.append(data)


        def getData(self):
            """getData()
            returns the instance attribute self.__data (list)"""

            return self.__data
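
xmllib was dropped from the standard library in Python 3, so the class
above only runs on old Python 2 installations. The same idea can be
sketched with xml.sax, which is still in the standard library. This is
only a rough equivalent under a few assumptions, not the thesis code:
the names WordCollector and collect and the sample document are made up
for illustration, and unlike the original it skips chunks that contain
nothing but whitespace or punctuation.

```python
import io
import re
import xml.sax


class WordCollector(xml.sax.ContentHandler):
    """Hypothetical SAX counterpart of the xmllib-based Parser class."""

    def __init__(self):
        super().__init__()
        self.data = []

    def startElement(self, name, attrs):
        # Counterpart of start_class(): store the name-attribute
        # of <class> start-tags, if there is one.
        if name == "class" and "name" in attrs:
            self.data.append(attrs["name"])

    def characters(self, content):
        # Counterpart of handle_data(): replace every non-word
        # character with a space. Assumption: chunks that end up
        # blank are dropped (the original stored them).
        cleaned = re.sub(r"\W", " ", content)
        if cleaned.strip():
            self.data.append(cleaned)


def collect(stream):
    """Parse an open binary stream and return the collected strings."""
    handler = WordCollector()
    xml.sax.parse(stream, handler)
    return handler.data


# Made-up sample document, for illustration only.
sample = b'<spec><class name="Parser">reads XML</class></spec>'
words = collect(io.BytesIO(sample))
print(words)
```

Note that a SAX parser may deliver the text between two tags in more
than one characters() call, so a single run of text can end up as
several list entries.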



I'm not the best programmer around, but I found it really easy to build
this. Hope this gets ya started!

    yours

        Bas

-- 
Bas van Gils <bas.vangils at home.nl>   -    http://members.home.nl/bas.vangils
Build a system that even a fool can use, and only a fool will want to use it.