How to read JSON file containing multiple dictionaries with boto3

Time:01-07

I have several JSON files containing multiple dictionaries stored in S3. I need to access each line and rename some of the keys. I have written the code in my local environment which works flawlessly, but I run into issues using Lambda. Usually, I get an Expecting property name enclosed in double quotes error.

Example JSON:

{
 "request": 123,
 "key1": [
   {
    "timestamp_unix": 98321,
    "key_2": "Portugal"
   }
  ]
}
{
 "request": 456,
 "key1": [
   {
    "timestamp_unix": 35765,
    "key_2": "China"
   }
  ]
}

Local code:

import json

with open("myfile.json", "r") as f:
    my_file = [json.loads(line) for line in f]
for j in my_file:
    j["key1"][0]["key2"] = j["key1"][0].pop("key_2")

AWS code:

import boto3
import json

s3 = boto3.resource("s3")

obj = s3.Object("my-bucket", "path_to/myfile.json")
json_string = obj.get()["Body"].read().decode("utf-8") # this is where my json object is read in with single quotes instead of double quotes
my_file = [json.loads(line) for line in json_string] # error error error

I also tried:

import boto3
import json

s3_client = boto3.client("s3")

obj = s3_client.get_object(Bucket="my-bucket", Key="path_to/myfile.json")
json_string = obj["Body"].read().decode() # this is where my json object is read in with single quotes instead of double quotes
my_file = [json.loads(line) for line in json_string] # error error error

I removed the decode() call altogether, but this didn't work either. I don't want to (and can't) change the underlying JSON files to store the dicts in a list.

How can I read JSON files containing multiple dictionaries with boto3?

CodePudding user response:

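In the question, json_string is a single str, so for line in json_string iterates over it one character at a time. json.loads then receives "{" on its own, which is what raises the Expecting property name enclosed in double quotes error; the quotes in the file are not being changed. The boto3 equivalent of for line in f: is to use the iter_lines() method of the response body.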

lines = obj.get()["Body"]
my_file = [json.loads(line) for line in lines.iter_lines()]
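
To see the difference without touching S3, here is a minimal sketch that uses io.BytesIO as a stand-in for the response body. It assumes one JSON object per line, as the local code in the question does. The helper names parse_json_lines and rename_key are made up for illustration and are not part of boto3. It also shows that an already decoded string can be split with splitlines() instead of being iterated directly.

```python
import io
import json


def parse_json_lines(lines):
    """Parse an iterable of lines (bytes or str), one JSON object per line."""
    records = []
    for line in lines:
        if line.strip():  # skip blank lines such as a trailing newline
            records.append(json.loads(line))
    return records


def rename_key(record, old="key_2", new="key2"):
    """Rename `old` to `new` in every dict under record["key1"] (hypothetical helper)."""
    for item in record["key1"]:
        item[new] = item.pop(old)
    return record


# Stand-in for obj.get()["Body"]; an assumption so the sketch runs offline.
body = io.BytesIO(
    b'{"request": 123, "key1": [{"timestamp_unix": 98321, "key_2": "Portugal"}]}\n'
    b'{"request": 456, "key1": [{"timestamp_unix": 35765, "key_2": "China"}]}\n'
)

# Iterating a file-like object yields whole lines, much like iter_lines() does.
records = [rename_key(r) for r in parse_json_lines(body)]

# Iterating a decoded string yields single characters instead of lines.
text = '{"request": 123}\n{"request": 456}\n'
first_item = next(iter(text))

# Passing just that first character to json.loads fails.
try:
    json.loads(first_item)
    char_parse_failed = False
except json.JSONDecodeError:
    char_parse_failed = True

# Splitting the string into lines first works.
records_from_text = parse_json_lines(text.splitlines())
```

With a real S3 object, parse_json_lines(obj.get()["Body"].iter_lines()) should behave the same way, since iter_lines() yields one line of bytes at a time and json.loads accepts bytes.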