have got input dataframe like below and a variable 'extra'
input_df
Item Space refill Min Max
Apple 0.25 12.53 0.125 0.25
Lemon 0.25 11.2 0.25 0.375
Potato 0.375 10.3 0.375 0.75
Melon 0.375 9.2 0.25 0.75
'extra' variable can be negative or positive.
Expected output_df if extra = 0.25
(this extra 0.25 space need to be splitted in the multiples of 0.125 and added to items with higher refill order. In this case, 0.125 0.125 = 0.25 so two items space will be updated. Also should consider Max value here, on adding space it should not exceed Max value, if happens skip that item and move to next.)
Item Space refill Min Max
Apple 0.25 12.53 0.125 0.25 #item skipped since it exceeds Max limit if extra space added
Lemon 0.375 11.2 0.25 0.375 #0.125 from extra space added here
Potato 0.5 10.3 0.375 0.75 #remaining 0.125 from extra space added here
Melon 0.375 9.2 0.25 0.75 #no more space to be added
Expected output_df if extra = -0.25
(this extra -0.25 space[which is two times of -0.125] need to be reduced from items with higher refill. Also should consider Min Value here, on reducing space it should not go below Min value, if happens skip that item and move to next.)
Item Space refill Min Max
Apple 0.125 12.53 0.125 0.25 #-0.125 from extra space reduced here
Lemon 0.25 11.2 0.25 0.375 #item skipped since it doesn't satisfy Min limit if reduced
Potato 0.375 10.3 0.375 0.75 #item skipped since it doesn't satisfy Min limit if reduced
Melon 0.25 9.2 0.25 0.75 #rem -0.125 from extra space reduced here
Any help will be appreciated. Thank You!
CodePudding user response:
Assumptions:
- Both
extraand the values in columnsMinandMaxshould always contain multiples of0.125. E.g.df.Min % 0.125should always lead to a series of zeros. - The
dfindex contains unique values only. If this is not yet the case, usedf.reset_index(drop=True, inplace=True)to reset it. Or usedrop=False, if you want to keep the original index values. You can then reset them afterwards, usingdf.set_index('index', drop=True, inplace=True).
Given the assumptions, the following should achieve this:
import pandas as pd
import numpy as np
data = {'Item': {0: 'Apple', 1: 'Lemon', 2: 'Potato', 3: 'Melon'},
'Space': {0: 0.25, 1: 0.25, 2: 0.375, 3: 0.375},
'refill': {0: 12.53, 1: 11.2, 2: 10.3, 3: 9.2},
'Min': {0: 0.125, 1: 0.25, 2: 0.375, 3: 0.25},
'Max': {0: 0.25, 1: 0.375, 2: 0.75, 3: 0.75}}
def f(df, extra):
length = int(abs(extra/0.125))
repeats = np.repeat([0.125 * -1 if np.signbit(extra) else 0.125], len(df))
tmp = np.where((pd.Series(repeats).lt(0) & df.Space.gt(df.Min)) |
(pd.Series(repeats).gt(0) & df.Space.lt(df.Max)),
df.Space.add(repeats), df.Space)
new_series = np.where(df.Space.ne(tmp).cumsum().gt(length), df.Space, tmp)
return new_series
Function applied:
df = pd.DataFrame(data)
df.Space = f(df, 0.25)
print(df)
Item Space refill Min Max
0 Apple 0.250 12.53 0.125 0.250
1 Lemon 0.375 11.20 0.250 0.375
2 Potato 0.500 10.30 0.375 0.750
3 Melon 0.375 9.20 0.250 0.750
df = pd.DataFrame(data)
df.Space = f(df, -0.25)
print(df)
Item Space refill Min Max
0 Apple 0.125 12.53 0.125 0.250
1 Lemon 0.250 11.20 0.250 0.375
2 Potato 0.375 10.30 0.375 0.750
3 Melon 0.250 9.20 0.250 0.750
Suppose we use extra= 0.375, then the last item will also change:
df = pd.DataFrame(data)
df.Space = f(df, 0.375)
print(df)
Item Space refill Min Max
0 Apple 0.250 12.53 0.125 0.250
1 Lemon 0.375 11.20 0.250 0.375
2 Potato 0.500 10.30 0.375 0.750
3 Melon 0.500 9.20 0.250 0.750 # also changed
Explanation:
- Inside function
f, we first create annp.arraynamedrepeatswith a length equal to yourdf, filled either with-0.125or0.125, depending on the sign ofextra(usingnp.signbit). - Next, we use
np.whereand check either (values in columnSpace< 0 and > values in columnMin) or (values in columnSpace> 0 and < values in columnMax). For eachTruewe wantdf.Space.add(repeats), for eachFalsewe keep the values fromdf.Space. We store the result in a variabletmp. SeeSeries.ltandSeries.gt. - Finally, we apply
np.whereagain to limit the amount of changes to be made. In order to figure this out, we first applySeries.netotmpand getTruefor each mismatch (i.e. a change). ApplyingSeries.cumsumwill get us a cumulative sum of all theTruevalues. Once, a value inside this series becomes greater thanint(abs(extra/0.125))we need to stop making changes. So, from that point onwards, we fall back on the original values in columnSpace.
