Home > Blockchain >  Converting varying JSON objects to CSV-file
Converting varying JSON objects to CSV-file

Time:02-08

I am calling an API and it returns JSON string. I want to convert it to CSV-format so I can save it later to database. However, JSON objects keys cause problems because there is keys missing or keys are changing. I wrote this python script but because of keys I cannot get it to work:

import json
import csv

with open('custom.json') as json_file:
    data = json.load(json_file)

custom_data = data['CustomJSON']
data_file = open('data_file.csv', 'w')
csv_writer = csv.writer(data_file)
count = 0

for i in custom_data:
    if count == 0:
        # Writing headers of CSV file
        header = i.keys()
        csv_writer.writerow(header)
        count  = 1
    # Writing data of CSV file
    csv_writer.writerow(i.values())
data_file.close()

How I can convert this type of JSON to CSV? Example JSON message:

{
    "CustomJSON" : [
    {
     "id" : "1,
      "name" : "Jack",
      "surname" : "Bauer"
    },
    {
      "id" : "2",
      "name" : "John",
      "surname" : "Smith"
      "age" : "31",
      "city" : "New York"
    },
    {
      "id" : "3",
      "name" : "Matt",
      "surname" : "Secret"
      "exception_1" : "Exception_1",
      "exception_2" : "Exception_2"
      "date" : "2022-02-08"
    }
  ]
}

Should I try to loop all key-values first somehow and then later try to add data? Can anyone provide an example?

CodePudding user response:

As you are reading a single JSON string, you will have everything in memory. So IMHO the simplest way is to first build the field names list, and then write everything to a csv file.

# compute the fieldnamelist
# this uses a dict because it is easy to update it while maintaining key order
keys = dict()
for d in data['CustomJSON']:
    keys.update(d)

# write to the csv file
# this uses a DictWriter because the individual rows are already dicts
with open('data_file.csv', 'w', newline='') as data_file
    csv_writer = csv.DictWriter(data_file, fieldnames = keys.keys())
    _ = csv_writer.writeheader()
    _ = csv_writer.writerows(data['CustomJSON'])

With your data it gives as expected:

id,name,surname,age,city,exception_1,exception_2,date
1,Jack,Bauer,,,,,
2,John,Smith,31,New York,,,
3,Matt,Secret,,,Exception_1,Exception_2,2022-02-08

CodePudding user response:

I am a pandas fan(atic) so I'd do something like

import pandas as pd

# df is a pandas dataframe)
df = pd.read_json('http://data.com/foo')
df.to_csv('foo.csv')

Pandas has options for the CSV dialect, if need be. You should be able to do what you describe with those two function calls, though.

  •  Tags:  
  • Related