[Tutor] python: extracting nested json object from multiple files, write to separate text files

Alex Kleider akleider at sonic.net
Thu Oct 3 19:55:15 EDT 2019


On 2019-10-03 15:57, Gary LaRose wrote:
> Thank you for your guidance.
> I am attempting to extract a nested json object from multiple json
> files and write it to individual text files.
> I have been able to get a non-nested element ['text'] from the json
> files and write to text files using:


> for fname in filelist:
>      FI = open(fname, 'r', encoding = 'UTF-8')
>      FO = open(fname.replace('json', 'txt'), 'w', encoding = 'UTF-8')
>      json_object = json.load(FI)
>      FO.write(json_object['text'])
>      FI.close()
>      FO.close()
> 
> Below is an example json file. For each of the 2,900 files, I need to
> extract 'entities' and write it to a separate text file:
> 
> {'author': 'Reuters Editorial',
> 'crawled': '2018-02-02T12:58:39.000+02:00',
> 'entities': {'locations': [{'name': 'sweden', 'sentiment': 'none'},
>                             {'name': 'sweden', 'sentiment': 'none'},
>                             {'name': 'gothenburg', 'sentiment': 'none'}],
>               'organizations': [{'name': 'reuters', 'sentiment': 'negative'},
>                                 {'name': 'skanska ab', 'sentiment': 'negative'},
>                                 {'name': 'eikon', 'sentiment': 'none'}],
>               'persons': [{'name': 'anna ringstrom', 'sentiment': 'none'}]},
> 'external_links':
> ...........



Although I'm not one of the "tutors", I'd like to take a stab at helping
you, with the goal of getting some feedback on how close I get to
the correct answer.
I suggest you replace the line "FO.write(json_object['text'])" with
the following to get what I think you want:

     ret = []
     ret.append("Text component")
     ret.append("==============")
     # I assume you want the "text" component as well;
     # if not, delete this line and the two lines above it.
     # "text" is a single string, so append it whole rather than looping.
     ret.append(json_object["text"])
     ret.append("")
     ret.append("entities")
     ret.append("========")
     entities = json_object['entities']
     for key in entities:
         ret.append(key)
         ret.append("-" * len(key))
         # the records live under 'entities', not at the top level
         for record in entities[key]:
             ret.append("{name}: {sentiment}".format(**record))
     FO.write("\n".join(ret))

I've not tested.
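
For what it's worth, here is a minimal sketch of the whole loop in one piece. It assumes every file has a top-level 'text' string and an 'entities' dict shaped like the example above (lists of records with 'name' and 'sentiment' keys). The names format_report and convert_all are made up for illustration, and the "with" blocks are my own choice so that each file is closed as soon as it has been processed.

```python
import json
from pathlib import Path


def format_report(json_object):
    """Return the 'text' and 'entities' parts of one record as one string."""
    ret = [
        "Text component",
        "==============",
        json_object["text"],  # assumed to be a single string
        "",
        "entities",
        "========",
    ]
    # Assumed layout: {'locations': [...], 'organizations': [...], ...}
    for key, records in json_object["entities"].items():
        ret.append(key)
        ret.append("-" * len(key))
        for record in records:
            ret.append("{name}: {sentiment}".format(**record))
    return "\n".join(ret)


def convert_all(filelist):
    """Write a .txt report next to each .json file named in filelist."""
    for fname in filelist:
        src = Path(fname)
        with open(src, "r", encoding="UTF-8") as fi:
            json_object = json.load(fi)
        # with_suffix changes only the extension, not other parts of the path
        with open(src.with_suffix(".txt"), "w", encoding="UTF-8") as fo:
            fo.write(format_report(json_object))
```

You would call it as convert_all(filelist) with the same filelist you already have. A file that lacks 'text' or 'entities' will raise a KeyError, which may or may not be what you want across 2,900 files.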

