parsing json output

Paul McGuire ptmcg at austin.rr.com
Mon Mar 24 09:33:11 EDT 2008


On Mar 18, 9:10 pm, Gowri <gowr... at gmail.com> wrote:
> Hi,
>
> I have a service running somewhere which gives me JSON data. What I do
> is this:
>
> import urllib,urllib2
> import cjson
>
> url = 'http://cmsdoc.cern.ch/cms/test/aprom/phedex/dev/gowri/datasvc/tbedi/requestDetails'
> params = {'format':'json'}
> eparams = urllib.urlencode(params)
> request = urllib2.Request(url,eparams)
> response = urllib2.urlopen(request)    # This request is sent in HTTP POST
> print response.read()
>
> This prints a whole bunch of nonsense as expected. I use cjson and am
> unable to figure out how to print this json response and I guess if I
> can do this, parsing should be straightforward?
<snip>

Gowri -

On a lark, I tried using the JSON parser that ships with the examples
in pyparsing (also available online at http://pyparsing.wikispaces.com/space/showimage/jsonParser.py).
Pyparsing returns the parsed data as a results object that supports
attribute-style access to the individual fields of the JSON object.
(Note: this parser only reads JSON; it does not write it out.)

Here is the code to use the pyparsing JSON parser (after downloading
pyparsing and the jsonParser.py example), tacked on to your
previously-posted code. It assumes the JSON text has been read into a
variable 's', for example with s = response.read():


from jsonParser import jsonObject
data = jsonObject.parseString(s)

# dump out listing of object and attributes
print data.dump()
print

# print out specific attributes
print data.phedex.call_time
print data.phedex.instance
print data.phedex.request_call

# access an array of request objects
print len(data.phedex.request)
for req in data.phedex.request:
    #~ print req.dump()
    print "-", req.id, req.last_update


This prints out (long lines clipped with '...'):

[['phedex', [['request', [[['last_update', '1188037561'], ...
- phedex: [['request', [[['last_update', '1188037561'],
['numofapproved', '1'],...
  - call_time: 0.10059
  - instance: tbedi
  - request: [[['last_update', '1188037561'], ['numofapproved',
'1'], ...
  - request_call: requestDetails
  - request_date: 2008-03-24 12:56:32 UTC
  - request_timestamp: 1206363392.09
  - request_url: http://cmsdoc.cern.ch/cms/test/aprom/phedex/dev/gowri/datasvc/tbedi/requestDetails?format=json

0.10059
tbedi
requestDetails
1884
- 7425 1188037561
- 8041 1188751826
- 9281 1190116795
- 9521 1190248781
- 12821 1192615612
- 13121 1192729887
...

The dump() method is a quick way to see what keys are defined in the
output object. The attribute chains in the code (data.phedex.request,
for example) follow the nesting shown in the dump() output.
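
For comparison, here is a minimal sketch of the same nested access using
the json module from the standard library (added in Python 2.6, so not
available on the 2.5.1 used above). It is written in Python 3 syntax.
The string 's' is a hand-built stand-in that reuses a few field names
and values from the listing above; it is not the real service response,
and the value types in the real data may differ.

```python
import json

# Hand-built stand-in for the service response (an assumption, not real
# output). Field names and values are copied from the listing above.
s = """
{"phedex": {
    "call_time": 0.10059,
    "instance": "tbedi",
    "request_call": "requestDetails",
    "request": [
        {"id": 7425, "last_update": "1188037561"},
        {"id": 8041, "last_update": "1188751826"}
    ]
}}
"""

# json.loads turns JSON objects into dicts and JSON arrays into lists
data = json.loads(s)
phedex = data["phedex"]

# print out specific fields, using keys instead of attributes
print(phedex["call_time"])
print(phedex["instance"])
print(phedex["request_call"])

# access the array of request objects
summary = [(req["id"], req["last_update"]) for req in phedex["request"]]
print(len(summary))
for req_id, last_update in summary:
    print("-", req_id, last_update)
```

The nesting is the same as in the pyparsing version; the only difference
is dictionary-style keys (data["phedex"]["request"]) in place of
attribute-style names (data.phedex.request).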

Pyparsing is pure Python, so it is quite portable, and works with
Python 2.3.1 and up (I ran this example with 2.5.1).

You can find out more at http://pyparsing.wikispaces.com.

-- Paul


