I am trying to append to a pandas DataFrame that is defined outside a function. Here, I want to append df2 (created inside the function) to df (defined outside the function).
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'), index=['x', 'y'])
def test(df_t):
    df2 = pd.DataFrame([[5, 6], [7, 8]], columns=list('AB'), index=['x', 'y'])
    df = df.append(df2)
    print(df)
test(df)
I get "UnboundLocalError: local variable 'df' referenced before assignment" (which is expected, given the variable's scope).
I have gone through this post. However, its only answer suggests appending outside the function (with df declared inside the function). I need to declare df outside the function and append df2 to it inside the function.
If I use df.append(df2) instead of df = df.append(df2), the program raises no error, but the output is just the original df (nothing is appended).
CodePudding user response:
Your issue is the line below.
df = df.append(df2)
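In Python, any name that is assigned inside a function body is local to that function, unless you declare it with the special keywords global or nonlocal. Because df is assigned in test, the df on the right-hand side is looked up as a local variable that has no value yet.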
def test(df_t):
    global df
    df2 = pd.DataFrame([[5, 6], [7, 8]], columns=list('AB'), index=['x', 'y'])
    df = df.append(df2)
    print(df)
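The scoping rule can be seen without pandas. The following is only a minimal sketch; counter and the three function names are made up for illustration and do not come from the question.

```python
counter = 10  # module-level (global) name, hypothetical

def read_only():
    # No assignment to `counter` in this function, so the lookup
    # falls through to the module-level name.
    return counter + 1

def rebinding_without_global():
    # The assignment makes `counter` local for the whole function body,
    # so reading it on the right-hand side fails before it has a value.
    counter = counter + 1
    return counter

def rebinding_with_global():
    global counter  # assignments now target the module-level name
    counter = counter + 1
    return counter

print(read_only())
try:
    rebinding_without_global()
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)
print(rebinding_with_global())
```

Reading a global works without any declaration; only rebinding it inside the function requires global.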
Apart from that, the function and the naming are a bit messy, IMO. If you can, you should avoid such global declarations and pass the DataFrame into the function, as your signature already suggests.
Using globals is not great because your function then has side effects. You can never reliably predict its behaviour, since it depends on something from the outside. If that outside state changes, your function behaves differently even though the caller did not change the way it called the function.
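A version without global could look roughly like the sketch below. The name append_rows is hypothetical, and pd.concat is swapped in for DataFrame.append, which was deprecated in pandas 1.4 and removed in pandas 2.0; on older versions the append call from the question would work the same way here.

```python
import pandas as pd

def append_rows(base: pd.DataFrame) -> pd.DataFrame:
    """Return a new frame with two extra rows; `base` itself is not modified."""
    extra = pd.DataFrame([[5, 6], [7, 8]], columns=list("AB"), index=["x", "y"])
    # pd.concat builds a new DataFrame, so the caller must keep the return value.
    return pd.concat([base, extra])

df = pd.DataFrame([[1, 2], [3, 4]], columns=list("AB"), index=["x", "y"])
df = append_rows(df)  # the rebinding happens at the call site, not in the function
print(df)
```

This also explains why a bare df.append(df2) appeared to do nothing: the call returns a new DataFrame and leaves the original untouched, so the result has to be assigned to a name.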
CodePudding user response:
df is declared outside the function. If you want to rebind it inside the function, you must declare it global explicitly, but in that case the df_t parameter is useless.
df = pd.DataFrame([[1, 2], [3, 4]], columns=list('AB'), index=['x', 'y'])
def test():  # <- df_t is useless now
    global df  # HERE
    df2 = pd.DataFrame([[5, 6], [7, 8]], columns=list('AB'), index=['x', 'y'])
    df = df.append(df2)
    print(df)
test()  # no argument: the function works on the global df
But the suggestion of @Neither is more pertinent:
def test(df_t):
    df2 = pd.DataFrame([[5, 6], [7, 8]], columns=list('AB'), index=['x', 'y'])
    return df_t.append(df2)
print(test(df))  # pass df in and print the returned DataFrame
Output:
   A  B
x  1  2
y  3  4
x  5  6
y  7  8
