OrderedDict
silver0346 at gmail.com
Fri May 20 01:15:19 EDT 2016
On Wednesday, May 18, 2016 at 2:25:16 PM UTC+2, Peter Otten wrote:
> Chris Angelico wrote:
>
> > On Wed, May 18, 2016 at 7:28 PM, Peter Otten <__peter__ at web.de> wrote:
> >> I don't see an official way to pass a custom dict type to the library,
> >> but if you are not afraid to change its source code the following patch
> >> will allow you to access the value of dictionaries with a single entry as
> >> d[0]:
> >>
> >> $ diff -u py2b_xmltodict/local/lib/python2.7/site-packages/xmltodict.py
> >> py2_xmltodict/local/lib/python2.7/site-packages/xmltodict.py
> >> --- py2b_xmltodict/local/lib/python2.7/site-packages/xmltodict.py
> >> 2016-05-18 11:18:44.000000000 +0200
> >> +++ py2_xmltodict/local/lib/python2.7/site-packages/xmltodict.py
> >> 2016-05-18 11:11:13.417665697 +0200 @@ -35,6 +35,13 @@
> >> __version__ = '0.10.1'
> >> __license__ = 'MIT'
> >>
> >> +_OrderedDict = OrderedDict
> >> +class OrderedDict(_OrderedDict):
> >> +    def __getitem__(self, key):
> >> +        if key == 0:
> >> +            [result] = self.values()
> >> +            return result
> >> +        return _OrderedDict.__getitem__(self, key)
> >>
> >> class ParsingInterrupted(Exception):
> >>     pass
> >
> > Easier than patching might be monkeypatching.
> >
> > class OrderedDict(OrderedDict):
> >     ... getitem code as above ...
> > xmltodict.OrderedDict = OrderedDict
> >
> > Try it, see if it works.
>
> It turns out I was wrong on (at least) two accounts:
>
> - xmltodict does offer a way to specify the dict type
> - the proposed dict implementation will not solve the OP's problem
>
> Here is an improved fix which should work:
>
>
> $ cat sample.xml
> <?xml version="1.0" encoding="utf-8" ?>
> <profiles>
>   <profile id='visio02' revision='2015051501' >
>     <package package-id='0964-gpg4win' />
>   </profile>
> </profiles>
> $ cat sample2.xml
> <?xml version="1.0" encoding="utf-8" ?>
> <profiles>
>   <profile id='visio02' revision='2015051501' >
>     <package package-id='0964-gpg4win' />
>     <package package-id='0965-gpg4win' />
>   </profile>
> </profiles>
> $ cat demo.py
> import collections
> import sys
> import xmltodict
>
>
> class MyOrderedDict(collections.OrderedDict):
>     def __getitem__(self, key):
>         if key == 0 and len(self) == 1:
>             return self
>         return super(MyOrderedDict, self).__getitem__(key)
>
>
> def main():
>     filename = sys.argv[1]
>     with open(filename) as f:
>         doc = xmltodict.parse(f.read(), dict_constructor=MyOrderedDict)
>
>     print "doc:\n{}\n".format(doc)
>     print "package-id: {}".format(
>         doc['profiles']['profile']['package'][0]['@package-id'])
>
>
> if __name__ == "__main__":
>     main()
> $ python demo.py sample.xml
> doc:
> MyOrderedDict([(u'profiles', MyOrderedDict([(u'profile',
> MyOrderedDict([(u'@id', u'visio02'), (u'@revision', u'2015051501'),
> (u'package', MyOrderedDict([(u'@package-id', u'0964-gpg4win')]))]))]))])
>
> package-id: 0964-gpg4win
> $ python demo.py sample2.xml
> doc:
> MyOrderedDict([(u'profiles', MyOrderedDict([(u'profile',
> MyOrderedDict([(u'@id', u'visio02'), (u'@revision', u'2015051501'),
> (u'package', [MyOrderedDict([(u'@package-id', u'0964-gpg4win')]),
> MyOrderedDict([(u'@package-id', u'0965-gpg4win')])])]))]))])
>
> package-id: 0964-gpg4win
I have tested the first solution, and it works nicely. Previously I used xml.etree to parse 2000 XML files.
Execution time decreased from more than 5 minutes to 20 seconds. Great. At the weekend I will test the solution with the custom class.
Many thanks.
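
For reference, here is a minimal sketch of how the two quoted `__getitem__` variants differ, using plain `collections.OrderedDict` objects rather than real xmltodict output. The class names `FirstValueDict` and `SelfAsListDict` and the sample entry are only illustrative assumptions, not names from xmltodict or from the quoted messages.

```python
import collections


class FirstValueDict(collections.OrderedDict):
    """First quoted variant: d[0] returns the only *value* of a one-entry dict."""

    def __getitem__(self, key):
        if key == 0:
            # Unpacking raises ValueError unless there is exactly one entry.
            [result] = self.values()
            return result
        return super(FirstValueDict, self).__getitem__(key)


class SelfAsListDict(collections.OrderedDict):
    """Second quoted variant: d[0] returns the dict itself, so a single
    element can be indexed like a one-item list."""

    def __getitem__(self, key):
        if key == 0 and len(self) == 1:
            return self
        return super(SelfAsListDict, self).__getitem__(key)


# Hypothetical stand-in for one parsed <package package-id='0964-gpg4win' />.
first = FirstValueDict([("@package-id", "0964-gpg4win")])
second = SelfAsListDict([("@package-id", "0964-gpg4win")])

print(first[0])                  # the attribute value, not the package
print(second[0]["@package-id"])  # same lookup path as for a list of packages
```

With the first variant, `package[0]` already yields the attribute value, so a following `['@package-id']` lookup would fail. With the second, `package[0]['@package-id']` should work whether the profile contains one package (a dict) or several (a list), which appears to be what the quoted demo output shows.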