How to remove a row in a CSV file while looping through each row to run a function?

Time:01-29

I have a program that loops through a CSV file and runs a function on each line to perform a task. Once that task is performed, I want to remove that line from the CSV so I can keep track of what has changed while the script is running. Below is the part of the code that loops through the CSV file:

def migrate_repo(team_name, gh_token):
    with open('Repositories.csv', 'r') as csv_file:
        reader = csv.reader(csv_file)
        for x in reader:
            print("repository: " + str(x))
            print("1) Migrate repo")
            print("2) Skip repo")
            print("3) Exit")
            a = input("Please choose an option above: ")

The CSV looks like this, but each URL is different. There are no headers, and it has only one column:

http://ajdhfajdhfasdhflkjashdflkjahsdfjasdl
http://ajdhfajdhfasdhflkjashdflkjahsdfjasdl
http://ajdhfajdhfasdhflkjashdflkjahsdfjasdl
http://ajdhfajdhfasdhflkjashdflkjahsdfjasdl
http://ajdhfajdhfasdhflkjashdflkjahsdfjasdl
http://ajdhfajdhfasdhflkjashdflkjahsdfjasdl
http://ajdhfajdhfasdhflkjashdflkjahsdfjasdl
http://ajdhfajdhfasdhflkjashdflkjahsdfjasdl
etc.

I want to be able to remove the row after the loop finishes its task on that specific row and moves on to the next row.

CodePudding user response:

As @OneCriketeer pointed out, the only way to "modify" a file is to completely overwrite it with the modified data. To that end, I propose:

  1. Read all URLs from your repository CSV into a list
  2. Process the list, keeping track of the URLs your function handled successfully
  3. Subtract the processed URLs from the original list, leaving only the unprocessed ones
  4. Write over your repository CSV with the list of unprocessed URLs

All that effectively deletes processed lines from the original.

I cannot run this, so there may be a few typos in it, but here's the general idea:

import csv

# Each row from csv.reader is a list of strings; the file has a single column,
# so keep only the first field of each non-empty row.
with open('Repositories.csv', newline='') as csv_file:
    reader = csv.reader(csv_file)
    repositories = [row[0] for row in reader if row]

processed = []
for repo in repositories:
    print("repository: " + repo)   # repo is already a string, no str() conversion required
    print("1) Migrate repo")
    print("2) Skip repo")
    print("3) Exit")
    a = input("Please choose an option above: ")
    # ... do stuff
    # ... finally:
    processed.append(repo)

# "Subtract" the processed URLs from the original list, keeping the file's order
done = set(processed)
unprocessed = [repo for repo in repositories if repo not in done]

with open('Repositories.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    # writerows() expects one sequence per row, so wrap each URL in a list
    writer.writerows([repo] for repo in unprocessed)
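
If the script might be stopped partway through (for example, by choosing "Exit"), one possible variation is to rewrite the file after every successfully processed row, so the CSV always reflects what is left. The sketch below is untested against your real data and makes a few assumptions: `process_and_prune`, `write_remaining`, and the `handle` callback are hypothetical names, and `handle(url)` is assumed to return True when the row was migrated and can be dropped. It writes to a temporary file in the same directory and swaps it in with `os.replace`, so the original file is never left half-written.

```python
import csv
import os
import tempfile


def write_remaining(path, urls):
    """Replace the CSV at `path` with one URL per row, going through a temp file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", newline="") as tmp_file:
        csv.writer(tmp_file).writerows([url] for url in urls)
    os.replace(tmp_path, path)  # swap the new file in place of the old one


def process_and_prune(path, handle):
    """Call handle(url) for each URL; drop the URL from the file when it returns True."""
    with open(path, newline="") as csv_file:
        remaining = [row[0] for row in csv.reader(csv_file) if row]

    for url in list(remaining):  # iterate over a copy, since `remaining` shrinks
        if handle(url):
            remaining.remove(url)
            write_remaining(path, remaining)  # file on disk now matches `remaining`
    return remaining
```

Here `handle` would wrap the menu prompt and the migration call, returning False for "Skip repo". Rewriting the whole file on every row is wasteful for very large files, but for a list of repository URLs it is usually cheap, and it keeps the CSV accurate even if the script is interrupted.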