I am trying to convert the JSON-formatted text below to a pandas or Spark data frame, but it gives the following error.
ERROR: JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
Python CODE:
# import json to parse the file contents
import json

path = "sample.json"
with open(path, 'r') as myfile:
    data = myfile.read()

# strip tabs, newlines, and trailing commas before parsing
data = data.replace('\t', '')
data = data.replace('\n', '')
data = data.replace(',}', '}')
data = data.replace(',]', ']')
obj = json.loads(data)
JSON file format
Output of data after reading .json file by using open function
How can I convert the above text to a data frame?
CodePudding user response:
I got it working by adding a few lines of code:
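import pandas as pd

path = "sample.json"
with open(path, 'r') as myfile:
    data = myfile.read()

data = data.replace('\t', '')
data = data.replace('\n', '')
data = data.replace(',}', '}')
data = data.replace(',]', ']')
# eval() cannot resolve the bare word null, so swap it for an empty string
data = data.replace("null", "''")

liss = []
data1 = data[1:-1]           # drop the first and last character (the outer brackets)
data2 = data1.split("},")    # one chunk per record
for i in data2:
    last_value = i[len(i)-1]
    if last_value != "}":
        # split() removed the closing brace; put it back
        new_text = i + "}"
        liss.append(new_text)
    else:
        new_text = i
        liss.append(new_text)

sample_df = pd.DataFrame({"Col1": liss})
# eval() runs arbitrary code, so only use this on files you trust
sample_df["Col1"] = sample_df["Col1"].apply(lambda x: dict(eval(x)))
df3 = sample_df["Col1"].apply(pd.Series)
df3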
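A somewhat safer variant of the same idea, as a minimal sketch only. It assumes the file holds Python-style literals (single-quoted keys, null for missing values, trailing commas), which is a guess based on the error message and the replacements above, not something I have seen in the actual file. parse_records is a hypothetical helper name, and the sample text is made up. It uses ast.literal_eval from the standard library, which only accepts literals and does not execute code, in place of eval.

```python
import ast
import re


def parse_records(text):
    """Parse JSON-like text with single quotes and trailing commas into Python objects."""
    # Remove trailing commas that sit directly before a closing brace or bracket.
    cleaned = re.sub(r",\s*([}\]])", r"\1", text)
    # Map bare JSON keywords to their Python equivalents.
    # Caveat: this simple regex would also rewrite these words inside string values.
    cleaned = re.sub(r"\bnull\b", "None", cleaned)
    cleaned = re.sub(r"\btrue\b", "True", cleaned)
    cleaned = re.sub(r"\bfalse\b", "False", cleaned)
    # literal_eval only evaluates literals (dicts, lists, strings, numbers, None, booleans).
    return ast.literal_eval(cleaned.strip())


# Made-up sample text, not the asker's actual file.
sample = "[{'id': 1, 'name': 'a', 'score': null,},\n {'id': 2, 'name': 'b', 'score': 3.5,},]"
records = parse_records(sample)
print(records)
```

The resulting list of dictionaries can then be passed to pd.DataFrame(records) to get one row per record, without splitting the text by hand.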
CodePudding user response:
I think you can read the JSON and save it in a dictionary. Once you have this dictionary, you can create a Spark dataframe with the following line of code (here dict stands for the dictionary you loaded):
df = spark.createDataFrame(dict)
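As far as I know, createDataFrame expects a collection of rows rather than a single dictionary, so the parsed records may need reshaping first. A minimal sketch of that step, using only the standard library: records_to_rows is a hypothetical helper name, the sample records are made up, and each record is assumed to be a flat dictionary.

```python
def records_to_rows(records):
    """Turn a list of flat dicts into (column_names, row_tuples)."""
    # Collect column names in first-seen order so every row has the same layout.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    # Keys missing from a record become None.
    rows = [tuple(record.get(col) for col in columns) for record in records]
    return columns, rows


# Made-up sample records for illustration.
records = [{"id": 1, "name": "a"}, {"id": 2, "name": "b", "score": 3.5}]
columns, rows = records_to_rows(records)
print(columns)
print(rows)
```

With an active SparkSession named spark, something like spark.createDataFrame(rows, columns) should then build the dataframe. A column that contains only None values may need an explicit schema, since Spark cannot infer a type for it.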


