Home > Net >  How can I set conditions for dataframes?
How can I set conditions for dataframes?

Time:01-31

/we.tl/t-ghXIOjPznq

Here is my xlsx file.

https://imgur.com/b8kTbNV

I have such a dataframe. I want to define only for conditions where LITHOLOGY column is 1. In order to do that;

df2 = pd.read_excel('V131BLOG.xlsx')

LITHOLOGY = [1] & 

df2[df2.LITHOLOGY.isin(LITHOLOGY)]

There hasn't been a problem so far. I was able to filter as I wanted.

https://imgur.com/wcSvokM

In addition to these, I want to see cells with LITHOLOGY column as 1 If It's thickness is bigger than 15cms. What I mean is that, the cumulative difference of consecutive cells of DEPTH_MD column should be bigger than 10cms. I have not made any progress on this. What path should I follow?

As you can see in this (https://imgur.com/a/02nlUUl) figure, there can be seen serial group of LITHOLOGY column as 1. But when you check the DEPTH_MD values, upper group is equal to 10cms, on the other side, lower group is equal 5cms. I want to create a dataframe that only contains bigger than 10cms DEPTH_MD values.

Input:

        DEPTH_MD CALIPER GR     LITHOLOGY SHALLOW   DEEP
1980    329.00  26.8964 25.47160    2   2.99103 2.62130
1981    329.05  26.8574 32.54390    2   2.94772 2.58945
1982    329.10  27.1297 28.83750    1   2.90123 2.55601
1983    329.15  26.9742 17.91150    2   2.80383 2.52327
1984    329.20  28.3946 31.94310    2   2.76041 2.49050
1985    329.25  30.9402 17.63760    1   2.71992 2.46051
1986    329.30  35.2419 17.69170    1   2.67355 2.42852
1987    329.35  37.9206 17.74620    1   2.61838 2.33619
1988    329.40  39.9189 24.84460    2   2.56200 2.28671
1989    329.45  41.4947 7.03354     2   2.50669 2.23887
1990    329.50  41.5473 7.03354     2   2.42167 2.19944
1991    329.55  41.0158 10.58260    2   2.40039 2.17235

Output except:

        DEPTH_MD CALIPER GR  LITHOLOGY SHALLOW  DEEP
1985    329.25  30.9402 17.6376 1   2.71992 2.46051
1986    329.30  35.2419 17.6917 1   2.67355 2.42852
1987    329.35  37.9206 17.7462 1   2.61838 2.33619

CodePudding user response:

Group the consecutive 'LITHOLOGY' rows then compute the thickness and finally broadcast to all rows:

df['THICKNESS'] = (
    df.groupby(df['LITHOLOGY'].ne(df['LITHOLOGY'].shift()).cumsum())['DEPTH_MD'] 
      .transform(lambda x: x.diff().sum())
)

out = df[(df['LITHOLOGY'] == 1) & (df['THICKNESS'] >= 0.1)]

Output:

>>> out
      DEPTH_MD  CALIPER       GR  LITHOLOGY  SHALLOW     DEEP  THICKNESS
1985    329.25  30.9402  17.6376          1  2.71992  2.46051        0.1
1986    329.30  35.2419  17.6917          1  2.67355  2.42852        0.1
1987    329.35  37.9206  17.7462          1  2.61838  2.33619        0.1
  •  Tags:  
  • Related