pandas apply function that returns a DataFrame

Time:02-02

Say I have a DataFrame (or Series) of arguments, and a function f which takes those arguments and returns a DataFrame.

e.g.

import pandas as pd

arguments = pd.DataFrame({"a": [2, 3], "b": [10, 100]})

df = pd.DataFrame({"x": [1, 0, 0], "y": [0, 1, 0], "z": [0, 0, 1]})

def f(a, b):
    # Scale every cell of the module-level df by a * b
    return df * a * b

I want to get a DataFrame that stacks the DataFrames obtained from applying f to the arguments in each row of arguments:

     x    y    z
0   20    0    0
1    0   20    0
2    0    0   20
0  300    0    0
1    0  300    0
2    0    0  300

I can achieve this by explicitly constructing the result as follows...

pd.concat(f(a=row["a"], b=row["b"]) for _, row in arguments.iterrows())

...but as this is basically just an apply for a function that returns DataFrames, I was wondering if there's a pandas method for doing it.

CodePudding user response:

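Since f in your example is just a scalar multiplication, you can convert the DataFrames to NumPy arrays and multiply elementwise. Note that this shortcut only works for this particular f; it does not generalize to an arbitrary function that returns a DataFrame.

import numpy as np

multiplier = np.kron(arguments['a'].mul(arguments['b']).to_numpy(), np.ones(df.shape, dtype=int)).T
pd.DataFrame(np.tile(df.to_numpy(), (len(arguments), 1)) * multiplier, columns=df.columns)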

Output:

     x    y    z
0   20    0    0
1    0   20    0
2    0    0   20
3  300    0    0
4    0  300    0
5    0    0  300
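
The same result can also be reached with plain broadcasting instead of np.kron and np.tile. The following is only a sketch under the same assumption (f is a scalar multiplication of df); the names scale, blocks and stacked are made up for illustration and do not come from the answer above.

```python
import numpy as np
import pandas as pd

arguments = pd.DataFrame({"a": [2, 3], "b": [10, 100]})
df = pd.DataFrame({"x": [1, 0, 0], "y": [0, 1, 0], "z": [0, 0, 1]})

# One scale factor per row of `arguments`: a * b -> [20, 300]
scale = (arguments["a"] * arguments["b"]).to_numpy()

# Broadcast (n_args, 1, 1) against (n_rows, n_cols) -> (n_args, n_rows, n_cols)
blocks = scale[:, None, None] * df.to_numpy()

# Collapse the first two axes so the per-argument blocks are stacked vertically
stacked = pd.DataFrame(blocks.reshape(-1, df.shape[1]), columns=df.columns)
print(stacked)
```

As with the np.kron version, the row index of the result runs from 0 to 5 rather than repeating 0, 1, 2 for each block.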

CodePudding user response:

I think your solution is good. It's also very readable, which is important. The only thing I'd change is the way you're looping over the dataframe and passing the arguments.

Instead of using iterrows, you can use to_dict('records'), which returns a list of dicts (one per row) that can be neatly expanded as keyword arguments to f:

# Use a new name so the df that f reads from is not overwritten
result = pd.concat(f(**args) for args in arguments.to_dict('records'))

Output:

>>> result
     x    y    z
0   20    0    0
1    0   20    0
2    0    0   20
0  300    0    0
1    0  300    0
2    0    0  300
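
If you need this pattern often, the concat call can be wrapped in a small helper. The sketch below is one possible way to do it, not a built-in pandas method: apply_concat is a hypothetical name, and it simply forwards extra keyword arguments to pd.concat. It shows two standard pd.concat options, keys (to record which row of arguments produced each block) and ignore_index (to renumber the rows).

```python
import pandas as pd

arguments = pd.DataFrame({"a": [2, 3], "b": [10, 100]})
df = pd.DataFrame({"x": [1, 0, 0], "y": [0, 1, 0], "z": [0, 0, 1]})

def f(a, b):
    return df * a * b

def apply_concat(func, args, **concat_kwargs):
    """Hypothetical helper: call func once per row of args and stack the results."""
    frames = [func(**row) for row in args.to_dict("records")]
    return pd.concat(frames, **concat_kwargs)

# keys= adds an outer index level holding the originating row label of `arguments`
labelled = apply_concat(f, arguments, keys=list(arguments.index))

# ignore_index=True replaces the repeated 0, 1, 2 labels with 0..5
flat = apply_concat(f, arguments, ignore_index=True)

print(labelled.loc[1])  # the block produced by the second row of `arguments`
print(flat)
```

With keys, each block can be recovered later via labelled.loc[row_label]; with ignore_index, the result has a plain running index like the NumPy answer above.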