How to iterate through csv rows, apply a function to those values and append to new column?

Time:01-18

I have a Python script that calculates tree heights from the distance to each tree and the angle from the ground to its top. The script runs with no errors, but my heights column is left empty. Before anyone suggests going about it a different way: I don't want to use pandas, and I would like to keep to the 'with open' method if possible. The whole script seems to run fine and do everything I need until the "for row in csvread:" block. Any help would be great, thanks. This is my current script:

#!/usr/bin/env python3
# Import any modules needed
import sys
import csv
import math
import os
import itertools

# Extract command line arguments, remove file extension and attach to output_filename
input_filename1 = sys.argv[1]
input_filename2 = os.path.splitext(input_filename1)[0]
filenames = (input_filename2, "treeheights.csv")
output_filename = "".join(filenames)

def TreeHeight(degrees, distance):
    """
    This function calculates the heights of trees given distance 
    of each tree from its base and angle to its top, using the 
    trigonometric formula.
    """
    radians = math.radians(degrees)
    height = distance * math.tan(radians)
    print("Tree height is:", height)
    
    return height

def main(argv):
    with open(input_filename1, 'r') as f:
        with open(output_filename, 'w') as g:
    
            csvread = csv.reader(f)
            print(csvread)
            csvwrite = csv.writer(g)
    
            header = csvread.__next__()
            header.append("Height.m")
            csvwrite.writerow(header)
    
            # Populating the output csv with the input data
            csvwrite.writerows(itertools.islice(csvread, 0, 121))
    
            for row in csvread:
                height = TreeHeight(csvread[:,2], csvread[:,1])
                row.append(height)
                csvwrite.writerow(row)
    return 0

if __name__ == "__main__":
    status = main(sys.argv)
    sys.exit(status)

CodePudding user response:

Looking at your code, I think you're mostly there, but are a little confused on reading/writing rows:

# Populating the output csv with the input data
csvwrite.writerows(itertools.islice(csvread, 0, 121))

for row in csvread:
    height = TreeHeight(csvread[:,2], csvread[:,1])
    row.append(height)
    csvwrite.writerow(row)

It looks like you're reading rows 1 through 121 and writing them to your new file. Then you're trying to iterate over your CSV reader a second time, compute the height, tack that computed value onto the end of each row, and write the rows to your CSV again in a complete second pass.

If that's true, then you need to understand that the CSV reader and writer are not designed to work "left-to-right" like that (read and write these columns, then go back and read and write those columns).

They both work "top-down", processing rows.
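
To illustrate the top-down behaviour: a csv.reader hands out each row once, as a list of strings, and cannot be rewound. Here is a minimal sketch using an in-memory file. The column names and values are made up for the demonstration and are not taken from the question's data.

```python
import csv
import io
import itertools

# Hypothetical sample data standing in for the real input file
sample = io.StringIO("Species,Distance.m,Angle.degrees\noak,10,45\nash,20,30\n")

reader = csv.reader(sample)
header = next(reader)                                 # the header row
first_pass = list(itertools.islice(reader, 0, 121))   # consumes all remaining rows
second_pass = list(reader)                            # reader is now exhausted

print(header)        # ['Species', 'Distance.m', 'Angle.degrees']
print(first_pass)    # [['oak', '10', '45'], ['ash', '20', '30']]
print(second_pass)   # []
```

Once a row has been read, the only way to see it again is to keep a copy of it or to reopen the file.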

I propose, to get this working: iterate every row in one loop, and for every row:

  • read the values you need from the row to compute the height
  • compute the height
  • append the computed height to the original row
  • write the row
...

header = next(csvread)
header.append("Height.m")
csvwrite.writerow(header)

for row in csvread:
    degrees = float(row[1])   # second column for degrees?
    distance = float(row[0])  # first column for distance?
    height = TreeHeight(degrees, distance)
    row.append(height)
    csvwrite.writerow(row)

Some changes I made:

  • I replaced header = csvread.__next__() with header = next(csvread). Calling things that start with _ or __ is generally discouraged, at least in the standard library. next(<iterator>) is the built-in function that allows you to properly and safely advance through <iterator>.
  • I added float() conversion, because csv.reader returns every field as a string.
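
A quick way to see why the conversion matters is the sketch below. The value "45" is a made-up field, not one from the question's file.

```python
import math

field = "45"  # hypothetical field as csv.reader would return it: a str, not a number

try:
    math.radians(field)      # passing the raw string raises TypeError
    raised = False
except TypeError:
    raised = True

angle = math.radians(float(field))   # converting to float first works

print(raised)
print(angle)
```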

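Also, csvread[:,2] and csvread[:,1] won't work here. That comma-separated subscript is NumPy-style indexing, and a csv.reader object isn't subscriptable at all, so those expressions would raise a TypeError if they were ever evaluated. You didn't get any errors because the reader was already exhausted by the islice() call (assuming your file has no more than 121 data rows), so your program never actually stepped into the for row in csvread: loop.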
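
Putting the pieces together, a complete single-pass version might look like the sketch below. The names tree_height and add_heights, and the default column positions (distance in the first column, angle in the second), are assumptions made for illustration. Adjust the column positions to match the real file.

```python
import csv
import math


def tree_height(degrees, distance):
    """Return the height given the angle to the top and the distance to the base."""
    return distance * math.tan(math.radians(degrees))


def add_heights(input_path, output_path, distance_col=0, degrees_col=1):
    """Copy input_path to output_path, appending a Height.m column.

    distance_col and degrees_col are assumed column positions (0-based).
    """
    # newline="" is the setting the csv module documentation recommends
    with open(input_path, "r", newline="") as f, \
         open(output_path, "w", newline="") as g:
        reader = csv.reader(f)
        writer = csv.writer(g)

        # Copy the header and add the new column name
        header = next(reader)
        header.append("Height.m")
        writer.writerow(header)

        # Single pass: read a row, compute the height, append it, write the row
        for row in reader:
            degrees = float(row[degrees_col])
            distance = float(row[distance_col])
            row.append(tree_height(degrees, distance))
            writer.writerow(row)
```

This keeps the 'with open' approach from the question and drops the islice() call, so every row reaches the loop.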
