While iterating over the DataFrame, I change a value whenever a condition is satisfied. The value changes inside the loop, but the original DataFrame remains unchanged. Is there a way to solve this?
(I know that itertuples, iterrows and loc are available, but I want to use values because it is faster.)
import pandas as pd
df = pd.read_csv(filename)
for value in df.values:
    if A:
        value[2] = 3
    print(value) # changed
df.to_csv(newfilename) # unchanged
CodePudding user response:
The CSV should also be changed. I just tested this and it changed:
import pandas as pd
df = pd.DataFrame({
'A': [0,1,0,0,1,1,0,1,0],
'B': [1,0,1,1,0,0,1,0,1],
})
for value in df.values:
    if value[0] == 0:
        value[1] = 5
    print(value) # changed
df #also changed
df.to_excel("output.xlsx") #also changed
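One caveat, as far as I can tell: the test above uses two integer columns, so every column shares one dtype and `df.values` can hand back a view of the underlying data. A CSV loaded with `read_csv` often mixes dtypes, and then `df.values` builds a new array, so edits to it never reach the DataFrame. With Copy-on-Write enabled in newer pandas versions, the single-dtype case may also stop working because the returned array can be read-only. A minimal sketch of the mixed-dtype case (the column names and data here are made up for illustration):

```python
import pandas as pd

# Hypothetical mixed-dtype frame (int, str, float), similar to typical read_csv output
df = pd.DataFrame({
    "id": [1, 2, 3],
    "name": ["a", "b", "c"],
    "score": [0.5, 1.5, 2.5],
})

arr = df.values  # mixed dtypes: pandas builds a new object array here
for row in arr:
    if row[0] > 1:
        row[2] = 3  # modifies arr only, not df

print(arr)  # rows with id > 1 now hold 3 in the last position
print(df)   # the score column still holds the original values
```

This would explain why the loop prints changed rows while the file written by `to_csv` stays the same.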
CodePudding user response:
As mozway suggested, try vectorized code. This is generally the better choice when you work with pandas, since it avoids a Python-level loop over the rows.
df.loc[CONDITION, COLUMN_NAME_TO_WRITE_TO] = NEW_VALUE
For your example, maybe something like:
import pandas as pd
df = pd.read_csv(filename)
df.loc[A, df.columns[2]] = 3 # instead of df.columns[INDEX] you could directly use the column name
df.to_csv(newfilename)
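Here is a small self-contained version of the same idea that can be run without a CSV file. The column names, the data, and the condition `df["A"] == 0` are only assumptions standing in for your real condition `A`:

```python
import pandas as pd

# Made-up data; replace with pd.read_csv(filename) in real use
df = pd.DataFrame({
    "A": [0, 1, 0, 1],
    "B": [1, 0, 1, 0],
    "C": [7, 8, 9, 10],
})

mask = df["A"] == 0              # boolean Series, one entry per row
df.loc[mask, df.columns[2]] = 3  # write 3 into the third column where mask is True

print(df)
```

Because the assignment goes through `df.loc`, the DataFrame itself is updated, so a later `df.to_csv(...)` writes the new values.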
