Home > Software design >  Utilizing Pandas functionality/making code more pythonic to rewrite an excel macro
Utilizing Pandas functionality/making code more pythonic to rewrite an excel macro

Time:02-03

so I am rewriting an excel macro, and while I have successfully recreated it; I want to redo my code because it is a bit simplistic and is more of a simpler/brute force approach to the solution. I realized I am not really utilizing pandas functionality to it's fullest extent (am new to pandas, not to python). I also didn't have the most time to develop this so I just went.

There are a solid 30 formulas I need to recreate and they are basically all if statements nested within this starting for loop (these aren't real variable names):

for row in range(len(dataframe)):
    if dataframe.iat[row, 8][12:-2] == "some string":
        dataframe.iat[row, 4] = "some other string"
        
    # convert float to int into a string
    elif dataframe.iat[row, 8][10:-4] == str(int(dataframe.iat[row, 13]))[2:]:
        dataframe.iat[row, 4] = "some string2"
    else:
        dataframe.iat[row, 4] = "some string3"

Example of this code in VBA form:

=IF(MID(I2,13,2)="some string","some other string",(IF(RIGHT(N2,2)=MID(I2,11,2),"some string2","some string3")))

One more example (trust me I know using this many 'or's is not really pythonic at all - it will be fixed):

    if dataframe.iat[row, 27] != 0 or dataframe.iat[row, 28] != 0 or dataframe.iat[row, 29] 
              != 0 or dataframe.iat[row, 25] != 0 or dataframe.iat[row, 30] != 0:
        dataframe.iat[row, 23] = "Yes"
    else:
        dataframe.iat[row, 23] = "No"

The biggest problem I am facing is when I have to access multiple columns and it is probably due to my lack of pandas experience.

CodePudding user response:

It is tricky to give really solid answers without example data, however, I think .loc can solve most of this.

Example:

dataframe.loc[(dataframe["Column 8's name"].str[12:-2] == "some string"), "Column 4s name"] = "some other string"
dataframe.loc[(dataframe["Column 8's name"].str[10:-4] == dataframe["Column 13's name"].map(int).map(str).str[2:), "Column 4s name"] = "some string2"

# then finally just fill 
dataframe["Column 4s name"] = dataframe["Column 4s name"].fillna("some string3")

.loc is a much faster method than iterating through the rows, as it uses vectorization. I have also included the method .map, which replaces the built in str() and int() method.

  •  Tags:  
  • Related