Check if condition is true and create dataframe with answers

Time:01-24

I want to check whether a column's value is above 1 and then create a new column that holds 1 if it is and 0 if it is not. But I'm getting a ValueError that I don't understand. For example, when I do

x = df[df['Close']>1]
print(x)

I get a DataFrame with the rows that meet the condition, but I can't get the code below to work:

import numpy as np
import pandas as pd
from numpy.random import randn

df = pd.DataFrame(randn(100,2),columns='Open Close'.split())

np.random.seed(101)

df['check'] = np.where(df[df['Close']>1]) , 1, 0

#df[df['Close']>1]


print(df)
Traceback (most recent call last):
  File "C:\Users\jeppe\PycharmProjects\pandas-ta-trial\main.py", line 32, in <module>
    df['position'] = np.where(df[df['Close']>1]) , 1, 0
  File "C:\Users\jeppe\PycharmProjects\test\pandas-ta-trial\lib\site-packages\pandas\core\frame.py", line 3612, in __setitem__
    self._set_item(key, value)
  File "C:\Users\jeppe\PycharmProjects\test\pandas-ta-trial\lib\site-packages\pandas\core\frame.py", line 3784, in _set_item
    value = self._sanitize_column(value)
  File "C:\Users\jeppe\PycharmProjects\test\pandas-ta-trial\lib\site-packages\pandas\core\frame.py", line 4509, in _sanitize_column
    com.require_length_match(value, self.index)
  File "C:\Users\jeppe\PycharmProjects\test\pandas-ta-trial\lib\site-packages\pandas\core\common.py", line 531, in require_length_match
    raise ValueError(
ValueError: Length of values (3) does not match length of index (100)

Process finished with exit code 1

CodePudding user response:

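The problem is the placement of the closing parenthesis. In your line, np.where is closed right after the condition, so the right-hand side becomes the 3-element tuple (np.where(...), 1, 0). That is where "Length of values (3)" in the error comes from. The condition should also be the boolean mask df['Close'] > 1, not the filtered DataFrame df[df['Close'] > 1].

Replace:

df['check'] = np.where(df[df['Close']>1]) , 1, 0
# the ")" right after ">1]" closes np.where too early, leaving "1, 0" outside the call

By:

df['check'] = np.where(df['Close'] > 1, 1, 0)
# the mask and both values are now inside the call, and ")" comes last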
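To see where the 3 in the error message comes from, here is a minimal sketch. It uses a small hypothetical 4-row frame in place of the 100-row one from the question, and the names mask, broken and fixed are only for illustration.

```python
import numpy as np
import pandas as pd

# Small hypothetical frame standing in for the random data in the question.
df = pd.DataFrame({"Open": [0.5, 0.2, -0.3, 1.4],
                   "Close": [1.5, 0.3, 2.0, -0.7]})

mask = df["Close"] > 1  # boolean Series, one True/False per row

# Parenthesis closed too early: Python builds a tuple (np.where(mask), 1, 0).
broken = np.where(mask), 1, 0
print(type(broken).__name__, len(broken))  # tuple 3

# All three arguments inside the call: one value per row.
fixed = np.where(mask, 1, 0)
print(fixed)  # [1 0 1 0]

df["check"] = fixed  # lengths match, so the assignment works
```

pandas tries to use the 3-element tuple as the new column, which cannot fit an index of length 100 (or 4 in this sketch), hence the ValueError.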

Another way to do it:

df['check'] = df['Close'].gt(1).astype(int)
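As a quick check that the .gt(1).astype(int) form gives the same 1/0 flags as np.where with a boolean mask, here is a small sketch on hypothetical values (including exactly 1.0, which is not "above 1"):

```python
import numpy as np
import pandas as pd

# Hypothetical values; 1.0 is included to show that the comparison is strict.
df = pd.DataFrame({"Close": [1.5, 0.3, 2.0, 1.0]})

via_where = np.where(df["Close"] > 1, 1, 0)   # NumPy array of 1/0
via_gt = df["Close"].gt(1).astype(int)        # pandas Series of 1/0

print(via_gt.tolist())                          # [1, 0, 1, 0]
print((via_where == via_gt.to_numpy()).all())   # True
```

The .gt form stays in pandas and keeps the index. np.where is handy when the two outcomes are something other than 1 and 0.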

Update with your seed (set before the data is generated, so the numbers are reproducible):

np.random.seed(101)
df = pd.DataFrame(np.random.randn(100,2),columns=['Open', 'Close'])
df['check'] = df['Close'].gt(1).astype(int)
print(df[df['Close'] > 1])

# Output
        Open     Close  check
8   0.190794  1.978757      1
10  0.302665  1.693723      1
18 -0.116773  1.901755      1
19  0.238127  1.996652      1
27  0.558769  1.024810      1
28 -0.925874  1.862864      1
30  0.386030  2.084019      1
32  0.681209  1.035125      1
33 -0.031160  1.939932      1
36 -1.382920  1.482495      1
38  0.992573  1.192241      1
39 -1.046780  1.292765      1
43 -0.855196  1.541990      1
45 -0.568581  1.407338      1
47 -0.391157  1.028293      1
63 -0.532471  2.117727      1
64  0.197524  2.302987      1
74 -0.982776  2.231555      1
80  0.093628  1.240813      1
83 -2.736995  1.522562      1
85 -0.391089  1.743477      1
93 -0.205792  2.493990      1