[Tutor] python: extracting nested json object from multiple files, write to separate text files
Gary LaRose
garylarose at outlook.com
Thu Oct 3 18:57:52 EDT 2019
Thank you for your guidance.
I am attempting to extract a nested JSON object from each of several JSON files and write it to an individual text file per input file.
I have been able to get a non-nested element ['text'] from the json files and write to text files using:
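import glob
import json
import os

# Process every .json file in the current working directory
for fname in glob.glob('./*.json'):
    # Swap only the extension; fname.replace('json', 'txt') would also
    # change any other 'json' that appears in the path
    out_name = os.path.splitext(fname)[0] + '.txt'
    with open(fname, 'r', encoding='UTF-8') as FI, \
            open(out_name, 'w', encoding='UTF-8') as FO:
        json_object = json.load(FI)
        # 'text' is a plain string, so it can be written directly
        FO.write(json_object['text'])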
I have set the working directory to the folder that contains the json files.
Below is an example JSON file. For each of the 2,900 files, I need to extract 'entities' and write it to a separate text file:
{'author': 'Reuters Editorial',
 'crawled': '2018-02-02T12:58:39.000+02:00',
 'entities': {'locations': [{'name': 'sweden', 'sentiment': 'none'},
                            {'name': 'sweden', 'sentiment': 'none'},
                            {'name': 'gothenburg', 'sentiment': 'none'}],
              'organizations': [{'name': 'reuters', 'sentiment': 'negative'},
                                {'name': 'skanska ab', 'sentiment': 'negative'},
                                {'name': 'eikon', 'sentiment': 'none'}],
              'persons': [{'name': 'anna ringstrom', 'sentiment': 'none'}]},
 'external_links': ['http://thomsonreuters.com/en/about-us/trust-principles.html'],
 'highlightText': '',
 'highlightTitle': '',
 'language': 'english',
 'locations': [],
 'ord_in_thread': 0,
 'organizations': [],
 'persons': [],
 'published': '2018-02-01T15:02:00.000+02:00',
 'text': 'Feb 1 (Reuters) - Skanska Ab:\n'
         '* SKANSKA DIVEST OFFICE BUILDINGS IN GOTHENBURG, SWEDEN, FOR ABOUT '
         'SEK 1 BILLION Source text for Eikon: Further company coverage: '
         '(Reporting By Anna Ringstrom)\n'
         ' ',
 'thread': {'country': 'US',
            'domain_rank': 408,
            'main_image': 'https://s4.reutersmedia.net/resources_v2/images/rcom-default.png',
            'participants_count': 1,
            'performance_score': 0,
            'published': '2018-02-01T15:02:00.000+02:00',
            'replies_count': 0,
            'section_title': 'Archive News & Video for Thursday, 01 Feb '
                             '2018 | Reuters.com',
            'site': 'reuters.com',
            'site_full': 'www.reuters.com',
            'site_section': 'http://www.reuters.com/resources/archive/us/20180201.html',
            'site_type': 'news',
            'social': {'facebook': {'comments': 0, 'likes': 0, 'shares': 0},
                       'gplus': {'shares': 0},
                       'linkedin': {'shares': 0},
                       'pinterest': {'shares': 0},
                       'stumbledupon': {'shares': 0},
                       'vk': {'shares': 0}},
            'spam_score': 0.21,
            'title': 'BRIEF-Skanska sells office buildings in Sweden for '
                     'around 1 bln SEK',
            'title_full': '',
            'url': 'https://www.reuters.com/article/brief-skanska-sells-office-buildings-in/brief-skanska-sells-office-buildings-in-sweden-for-around-1-bln-sek-idUSASM000IRO',
            'uuid': 'c83c8bf46fdb8d597e6c10ad16f221379c1c0705'},
 'title': 'BRIEF-Skanska sells office buildings in Sweden for around 1 bln SEK',
 'url': 'https://www.reuters.com/article/brief-skanska-sells-office-buildings-in/brief-skanska-sells-office-buildings-in-sweden-for-around-1-bln-sek-idUSASM000IRO',
 'uuid': 'c83c8bf46fdb8d597e6c10ad16f221379c1c0705'}
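
For reference, one possible way to handle the nested 'entities' object is sketched below. It is only a sketch under some assumptions: the files on disk are valid JSON, 'entities' is a top-level key as in the example above, and a pretty-printed JSON dump is an acceptable text format. Unlike 'text', the value of 'entities' is a dict, and file.write() accepts only a str, so it has to be serialized first (here with json.dumps). The function names and the '_entities.txt' suffix are made up for illustration.

```python
import glob
import json
import os


def extract_entities(json_path, key="entities"):
    """Load one JSON file and return the object stored under `key`.

    Returns an empty dict if the key is missing (an assumption about how
    files without entities should be treated).
    """
    with open(json_path, "r", encoding="utf-8") as infile:
        record = json.load(infile)
    return record.get(key, {})


def write_entities(json_path, suffix="_entities.txt"):
    """Write the 'entities' object of json_path to a sibling text file.

    The output name is the input name with its extension replaced by
    `suffix` (a hypothetical naming scheme). Returns the output path.
    """
    entities = extract_entities(json_path)
    out_path = os.path.splitext(json_path)[0] + suffix
    with open(out_path, "w", encoding="utf-8") as outfile:
        # write() needs a string, so serialize the nested dict first
        outfile.write(json.dumps(entities, indent=2, ensure_ascii=False))
    return out_path


def convert_folder(folder="."):
    """Run write_entities on every .json file in `folder`; return output paths."""
    pattern = os.path.join(folder, "*.json")
    return [write_entities(fname) for fname in sorted(glob.glob(pattern))]
```

Calling convert_folder('.') from the folder that holds the JSON files would produce one '<name>_entities.txt' file per input. If only the names are wanted, the dict returned by extract_entities could instead be looped over by category ('locations', 'organizations', 'persons') and each entry's 'name' written on its own line.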