Home > Blockchain >  How to get last value of column from a data frame
How to get last value of column from a data frame

Time:01-15

I have a data frame like this

    ntil  ureach_x  ureach_y     awgt 
0      1         1        34  2204.25 
1      2        35        42  1700.25 
2      3        43        48   898.75 
3      4        49        53   160.25 

and an array of values like this

ulist = [41,57]

For each value in the list [41,57] I am trying to find if the values fall in between ureach_x and ureach_y and return the awgt value.

awt=[]
for u in ulist:
    for index,rows in df.iterrows():
                if (u >= rows['ureach_x'] and u <= rows['ureach_y']):
                        awt.append(rows['awgt'])

The above code works for within the value ranges of ureach_x and ureach_y. How do I check if the value in the list is greater than the last row of ureach_y. My data frame has dynamic shape with varying number of rows.

For example, The desired output for value 57 in the list is 160.25

I tried the following:

for u in ulist:
    for index,rows in df.iterrows():
                if (u >= rows['ureach_x'] and u <= rows['ureach_y']):
                        awt.append(rows['awgt'])
                elif (u >= rows['ureach_x'] and u > rows['ureach_y']):
                        awt.append(rows['awgt'])

However, this returns multiple values for 41 in the list. How do I refer only the last value in the column of reach_y in a iterrows loop.

The expected output is as follows:

for values in list: [41,57]

the corresponding values from df has to be returned. [1700.25 ,160.25]

CodePudding user response:

If I've understood correctly, you can perform a merge_asof:

s = pd.Series([41,57], name='index')
(pd.merge_asof(s, df, left_on='index', right_on='ureach_x')
   .set_index('index')['awgt']
 )

Output:

index
41    1700.25
57     160.25
Name: awgt, dtype: float64

CodePudding user response:

If you have 0 in the data and you want to have 2204.25 returned, you can add two lines to @mozway's code and perform merge_asof twice, once going backwards and once going forwards; then combine the two.

ulist = [0, 41, 57]
srs = pd.Series(ulist, name='num')
backward = pd.merge_asof(srs, df, left_on='num', right_on='ureach_x')
forward = pd.merge_asof(srs, df, left_on='num', right_on='ureach_x', direction='forward')
out = backward.combine_first(forward)['awgt']

Output:

0    2204.25
1    1700.25
2     160.25
Name: awgt, dtype: float64

Another option (an explicit loop over ulist):

out = []
for num in ulist:
    if ((df['ureach_x'] <= num) & (num <= df['ureach_y'])).any():
        x = df.loc[(df['ureach_x'] <= num) & (num <= df['ureach_y']), 'awgt'].iloc[-1]
    elif (df['ureach_x'] > num).any():
        x = df.loc[df['ureach_x'] > num, 'awgt'].iloc[0]
    else:
        x = df.loc[df['ureach_y'] < num, 'awgt'].iloc[-1]
    out.append(x)

Output:

[2204.25, 1700.25, 160.25]
  •  Tags:  
  • Related