How do I split each record in a column to new columns using the the key as column namesin Panda-CodePudding

df = pd.read_csv("sample.csv")

print(df)

ID	response	Date
234	{"statusCode":"00","statusDescription":"Successful","mRecord":"202111105530685","tranxReference":"012021116029","recipient":"09131976","amount":"1500","Code":"202505651505637179","network":"MVP","tranxDate":"14-11-2021 10:50 am"}	2021-11-14 10:50:55
235	{"statusCode":"00","statusDescription":"Successful","mRecord":"2021111496980","tranxReference":"01202111057048","recipient":"091598","amount":"1500","confirmCode":"D211114.1050040","network":"MVP","tranxDate":"14-11-2021 10:50 am"}	2021-11-14 10:50:56

I want to split all the records on "response" columns columns in another dataframe and all the keys will for the column header

CodePudding user response：

This can be done in two steps: first exploding the records in response and then concat with df after dropping the original response column. You might need to apply json.loads since your json record seems to be read as string:

import json
df_response = pd.json_normalize(df['response'].apply(json.loads))
df_out = pd.concat([df.drop('response', axis=1), df_response], axis=1)
print(df_out)

Output:

    ID                 Date statusCode statusDescription  ...                Code network            tranxDate      confirmCode
0  234  2021-11-14 10:50:55         00        Successful  ...  202505651505637179     MVP  14-11-2021 10:50 am              NaN
1  235  2021-11-14 10:50:56         00        Successful  ...                 NaN     MVP  14-11-2021 10:50 am  D211114.1050040