Python JSON parser and filterer getting a KeyError every time

Time:02-07

I am writing a program that filters the JSON content returned by a URL. I want to keep only the items whose description, keywords, or title contain the name "Andromeda". The program runs fine on some pages, but on other pages I keep getting an error and I do not know why. Here is my code:

from urllib.request import urlopen
import json

#Page = (number change)
url = "https://images-api.nasa.gov/search?q=galaxy&page=10"

response = urlopen(url)
data_json = json.loads(response.read())
#Filtering out the results that relate to the word
planets = [i for i in data_json['collection']['items'] if 'Andromeda' in i['data'][0]['title']]
planets1 = [i for i in data_json['collection']['items'] if 'Andromeda' in i['data'][0]['keywords']]
planets2 = [i for i in data_json['collection']['items'] if 'Andromeda' in i['data'][0]['description']]
print(planets)
print("--------------------------------------------")
print(planets1)
print("--------------------------------------------")
print(planets2)

Here is my error:

Traceback (most recent call last):
  File "C:Filter&Search.py", line 11, in <module>
    planets1 = [i for i in data_json['collection']['items'] if 'Andromeda' in i['data'][0]['keywords']]
  File "C:Filter&Search.py", line 11, in <listcomp>
    planets1 = [i for i in data_json['collection']['items'] if 'Andromeda' in i['data'][0]['keywords']]
KeyError: 'keywords'

Here is some of the content of my JSON file that I have tried to read and go through:

{
  "collection": {
    "version": "1.0",
    "href": "http://images-api.nasa.gov/search?q=galaxy&page=1",
    "items": [
      {
        "href": "https://images-assets.nasa.gov/image/PIA04921/collection.json",
        "data": [
          {
            "center": "JPL",
            "title": "Andromeda Galaxy",
            "nasa_id": "PIA04921",
            "media_type": "image",
            "keywords": [
              "Galaxy Evolution Explorer GALEX"
            ],
            "date_created": "2003-12-10T22:41:32Z",
            "description_508": "This image is from NASA Galaxy Evolution Explorer is an observation of the large galaxy in Andromeda, Messier 31. The Andromeda galaxy is the most massive in the local group of galaxies that includes our Milky Way.",
            "secondary_creator": "NASA/JPL/California Institute of Technology",
            "description": "This image is from NASA Galaxy Evolution Explorer is an observation of the large galaxy in Andromeda, Messier 31. The Andromeda galaxy is the most massive in the local group of galaxies that includes our Milky Way."
          }
        ],
        "links": [
          {
            "href": "https://images-assets.nasa.gov/image/PIA04921/PIA04921~thumb.jpg",
            "rel": "preview",
            "render": "image"
          }
        ]
      },
      {
        "href": "https://images-assets.nasa.gov/image/PIA04634/collection.json",
        "data": [
          {

Answer:

The KeyError is raised because some items have no 'keywords' key in their data dictionary. The version below checks for the key first, so those items are skipped:

from urllib.request import urlopen
import json

#Page = (number change)
url = "https://images-api.nasa.gov/search?q=galaxy&page=10"

response = urlopen(url)
data_json = json.loads(response.read())
#Filtering out the results that relate to the word
planets = [i for i in data_json['collection']['items'] if 'Andromeda' in i['data'][0]['title']]
planets1 = [i for i in data_json['collection']['items'] if 'keywords' in i['data'][0] and 'Andromeda' in i['data'][0]['keywords']]
planets2 = [i for i in data_json['collection']['items'] if 'Andromeda' in i['data'][0]['description']]
print(planets)
print("--------------------------------------------")
print(planets1)
print("--------------------------------------------")
print(planets2)
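An alternative to the explicit 'keywords' in ... check is dict.get() with a default value. The snippet below is only a sketch: the sample dictionary is made up to mimic the shape of the API response (the second item deliberately has no 'keywords' key), so it runs without a network call.

```python
# Hypothetical sample shaped like the API response; not real NASA data.
sample = {
    "collection": {
        "items": [
            {"data": [{"title": "Andromeda Galaxy", "keywords": ["Andromeda", "GALEX"]}]},
            {"data": [{"title": "Spiral Galaxy"}]},  # no 'keywords' key
        ]
    }
}

items = sample["collection"]["items"]

# .get('keywords', []) returns an empty list when the key is absent,
# so the membership test is simply False instead of raising KeyError.
with_keyword = [i for i in items if "Andromeda" in i["data"][0].get("keywords", [])]

print(len(with_keyword))
```

The same pattern should work for 'title' and 'description' with an empty string as the default, in case those keys are also missing on some pages.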

To print a message whenever 'keywords' is missing from the dictionary, use an explicit loop:

planets1 = []
for i in data_json['collection']['items']:
    record = i['data'][0]
    if 'keywords' not in record:
        # Report only the items that really lack the key
        print('keywords not found')
    elif 'Andromeda' in record['keywords']:
        planets1.append(i)

If you prefer a list comprehension, you could do this:

# Items that fail the test print a message and leave None in the list,
# because print() returns None.
planets1 = [i if 'keywords' in i['data'][0] and 'Andromeda' in i['data'][0]['keywords'] else print('keywords missing or no match') for i in data_json['collection']['items']]
# Drop the None placeholders
planets1 = [i for i in planets1 if i is not None]
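One more thing to watch: 'keywords' is a list, so 'Andromeda' in i['data'][0]['keywords'] only matches an entry that is exactly 'Andromeda', not one like 'Andromeda Galaxy'. If you want a substring match across title, description, and keywords, something like the sketch below might help. The helper name mentions and the sample items are hypothetical, made up for illustration.

```python
def mentions(item, word, fields=("title", "description", "keywords")):
    """Return True if `word` occurs (case-insensitively) in any listed field."""
    record = item["data"][0]
    needle = word.lower()
    for field in fields:
        value = record.get(field, "")  # a missing field counts as empty
        # 'keywords' is a list of strings; join it so one substring test covers every entry
        text = " ".join(value) if isinstance(value, list) else str(value)
        if needle in text.lower():
            return True
    return False


# Hypothetical items shaped like the API response; not real NASA data.
sample_items = [
    {"data": [{"title": "Andromeda Galaxy", "keywords": ["Galaxy Evolution Explorer GALEX"]}]},
    {"data": [{"title": "Messier 31", "keywords": ["Andromeda Galaxy", "M31"]}]},
    {"data": [{"title": "Whirlpool Galaxy"}]},  # no 'keywords', no 'description'
]

matches = [i for i in sample_items if mentions(i, "Andromeda")]
print(len(matches))
```

With the real response you would pass data_json['collection']['items'] instead of sample_items. Keeping the lookup in one function means a missing key is handled in a single place rather than in three separate comprehensions.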