Python: how can I read a file of dictionaries separated by newlines?

I have a file of JSON objects, one per line, like this:

dict\n
dict\n
.
.
.

This is how I make this file:

import json

# 'a+' opens the file for appending and reading
with open('old_surveys.json', 'a+') as f1:
    for survey in data:
        surv = {"sid": survey["id"],
                "svy_ttl": survey["title"],
                "svy_link": survey["href"]
                }
        f1.seek(0)

        if str(surv["sid"]) not in f1.read():
            json.dump(surv, f1)
            f1.write('\n')
    f1.close()

Now I want to check if a specific dict is in the file old_surveys.json. How can I read it line by line?

CodePudding user response：

To avoid duplicates more efficiently, and to answer your question about reading the file line by line:

import json

with open('old_surveys.json', 'a+') as f1:
    # first load all the old surveys in a dictionary
    f1.seek(0)
    surveys = {}
    for line in f1:
        d = json.loads(line)
        surveys[d['sid']] = d
    # then write any new ones from data
    for survey in data:
        if survey['id'] not in surveys:
            json.dump({'sid': survey['id'], 'svy_ttl': survey['title'], 'svy_link': survey['href']}, f1)
            f1.write('\n')
    # this line is not needed, it closes thanks to with
    # f1.close()

If you expect duplicates within data itself, you may also want to build surv, write it to the file, and add it to surveys, so that later entries with the same id are skipped:

import json

with open('old_surveys.json', 'a+') as f1:
    f1.seek(0)
    surveys = {}
    for line in f1:
        d = json.loads(line)
        surveys[d['sid']] = d
    for survey in data:
        if survey["id"] not in surveys:
            surv = {"sid": survey["id"], "svy_ttl": survey["title"], "svy_link": survey["href"]}
            surveys[surv['sid']] = surv
            json.dump(surv, f1)
            f1.write('\n')

If you don't really need the surveys, but just the identifiers, this is more efficient:

import json

with open('old_surveys.json', 'a+') as f1:
    f1.seek(0)
    surveys = set()
    for line in f1:
        d = json.loads(line)
        surveys.add(d['sid'])
    for survey in data:
        if survey["id"] not in surveys:
            surv = {"sid": survey["id"], "svy_ttl": survey["title"], "svy_link": survey["href"]}
            surveys.add(surv['sid'])
            json.dump(surv, f1)
            f1.write('\n')

Here the dictionary has been replaced with a set, since you only need to keep track of the identifiers. The trade-off is that the rest of each survey is no longer available after this block, unlike in the earlier versions.
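
The set-based version can also be wrapped in a function, which makes it easier to try out on a scratch file. This is only a sketch: the name append_new_surveys and its return value are made up for illustration, and it assumes every non-blank line in the file is a single JSON object with a "sid" key.

```python
import json


def append_new_surveys(filename, data):
    """Append surveys from data whose id is not in the file yet; return how many were added."""
    added = 0
    # 'a+' creates the file if it is missing, allows reading, and always writes at the end
    with open(filename, "a+") as f:
        f.seek(0)  # append mode starts at the end of the file, so rewind before reading
        seen = set()
        for line in f:
            if line.strip():  # tolerate blank lines (assumption: they carry no data)
                seen.add(json.loads(line)["sid"])
        for survey in data:
            if survey["id"] not in seen:
                surv = {"sid": survey["id"], "svy_ttl": survey["title"], "svy_link": survey["href"]}
                seen.add(surv["sid"])  # also catches duplicates inside data itself
                json.dump(surv, f)
                f.write("\n")
                added += 1
    return added
```

Calling it twice with the same data should add nothing the second time, since every id is already in the file by then.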

CodePudding user response：

Assuming you have a file like this:

{"sid": 1, "svy_ttl": "foo", "svy_link": "foo.com"}
{"sid": 2, "svy_ttl": "bar", "svy_link": "bar.com"}
{"sid": 3, "svy_ttl": "Alice", "svy_link": "alice.com"}
{"sid": 4, "svy_ttl": "Bob", "svy_link": "bob.com"}

How about this code snippet? I'm not sure it is the optimal solution, though.

import json


def target_dict_exists(target_dict, filename):
    with open(filename, "r") as f:
        for line in f:
            if json.loads(line) == target_dict:
                return True
    return False


if __name__ == "__main__":
    target = {"sid": 3, "svy_ttl": "Alice", "svy_link": "alice.com"}
    print(target_dict_exists(target, "test.txt"))
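
If comparing the whole dict is stricter than you need, a variant that matches on the identifier alone might look like the sketch below. The function name sid_exists is hypothetical, and it assumes each non-blank line is a JSON object.

```python
import json


def sid_exists(sid, filename):
    """Return True if any line of the file holds a JSON object with this sid."""
    with open(filename, "r") as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines instead of failing in json.loads
            if json.loads(line).get("sid") == sid:
                return True
    return False
```

Unlike a substring check on the raw file contents, this compares parsed values, so looking for sid 1 does not match a line whose sid is 10.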