Error in processing JSON files in Python
MRAB
python at mrabarnett.plus.com
Mon Mar 30 19:18:06 EDT 2015
On 2015-03-30 22:27, Karthik Sharma wrote:
> I have the following Python program to read a set of JSON files, do some processing on them, and dump them back to the same folder. However, when I run the program below and then try to view the output JSON file using
>
> `cat file.json | python -m json.tool`
>
> I get the following error
>
> `extra data: line 1 column 307 - line 1 column 852 (char 306 - 851)`
>
> What is wrong with my program?
>
> #Process 'new' events to extract more info from 'Messages'
> rootDir = '/home/s_parts'
> for dirName, subdirList, fileList in os.walk(rootDir):
>     print('Found directory: %s' % dirName)
>     for fname in fileList:
>         fname='s_parts/'+fname
>         with open(fname, 'r+') as f:
>             json_data = json.load(f)
>             et = json_data['Et']
>             ms = json_data['Ms']
>             if (event == 'a.b.c.d') or (event == 'e.f.g.h'):
>                 url = re.sub('.+roxy=([^& ]*).*', r'\1', ms)
>                 nt = re.findall(r"NT:\s*([^,)]*)",ms)[0]
>                 bt = re.findall(r"BT:\s*([^,)]*)",ms)[0]
>                 xt = re.findall(r"XT:\s*([^,)]*)",ms)[0]
>                 appde = ms.split('Appde:')[1].strip().split('<br>')[0]
>                 version = ms.split('version:')[1].strip().split('<br>')[0]
>                 json_data["url"] = url
>                 json_data["BT"] = bt
>                 json_data["XT"] = xt
>                 json_data["NT"] = nt
>                 json_data["Appde"] = appde
>                 json_data["version"] = version
>             else:
>                 json_data["url"] = "null"
>                 json_data["BT"] = "null"
>                 json_data["XT"] = "null"
>                 json_data["NT"] = "null"
>                 json_data["Appde"] = "null"
>                 json_data["version"] = "null"
>             json.dump(json_data,f)
>
> If I do a `file` command on the output file I get
> `s_parts/data_95: ASCII text, with very long lines, with no line terminators`
>
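open(fname, 'r+') opens the file for update. json.load(f) reads to the
end of the file, so json.dump(json_data,f) then writes at that position,
in effect _appending_ to it: the file now contains the old data followed
by the new data, which is the "extra data" that json.tool complains
about. Either seek back to the start before dumping and truncate
afterwards, or reopen the file with 'w' for the dump.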
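
A minimal sketch of the seek-and-truncate approach, in case it helps. The
function name rewrite_json_in_place and its update callback are made up
for illustration; only open(), json.load(), json.dump(), f.seek() and
f.truncate() are real calls.

```python
import json

def rewrite_json_in_place(path, update):
    """Load JSON from path, let update(data) modify it, overwrite the file.

    Hypothetical helper for illustration; 'update' is any callable that
    mutates the loaded dict.
    """
    with open(path, 'r+') as f:
        data = json.load(f)   # reading leaves the file position at the end
        update(data)
        f.seek(0)             # move back to the start before writing
        json.dump(data, f)
        f.truncate()          # cut off leftover bytes if the new data is shorter
```

The truncate() matters only when the new data is shorter than the old,
but it is harmless otherwise. Opening the file a second time with 'w'
after loading avoids both the seek and the truncate.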
Another point: "null" is a string and will be written as such. If you
actually want a null in the JSON data, then that should be None.
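
To illustrate the difference, here is a small comparison using
json.dumps(); the "url" key is taken from the program above, and the
variable names are only for this example.

```python
import json

as_string = json.dumps({"url": "null"})   # value is the 4-character string
as_null = json.dumps({"url": None})       # value is a real JSON null

print(as_string)  # {"url": "null"}
print(as_null)    # {"url": null}
```

Loading the second form back with json.loads() gives None again, whereas
the first gives back the string 'null'.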